Abstract



Network information theory is the study of communication problems involving mul- 
tiple senders, multiple receivers and intermediate relay stations. The purpose of this 
thesis is to extend the main ideas of classical network information theory to the study 
of classical-quantum channels. We prove coding theorems for the following commu- 
nication problems: quantum multiple access channels, quantum interference channels, 
quantum broadcast channels and quantum relay channels. 

A quantum model for a communication channel describes more accurately the 
channel's ability to transmit information. By using physically faithful models for the 
channel outputs and the detection procedure, we obtain better communication rates 
than would be possible using a classical strategy. In this thesis, we are interested 
in the transmission of classical information, so we restrict our attention to the study 
of classical-quantum channels. These are channels with classical inputs and quantum
outputs, and so the coding theorems we present will use classical encoding and quantum 
decoding. 

We study the asymptotic regime where many copies of the channel are used in 
parallel, and the uses are assumed to be independent. In this context, we can exploit 
information-theoretic techniques to calculate the maximum rates for error-free com- 
munication for any channel, given the statistics of the noise on that channel. These 
theoretical bounds can be used as a benchmark to evaluate the rates achieved by prac- 
tical communication protocols. 

Most of the results in this thesis consider classical-quantum channels with finite-dimensional
output systems, which are analogous to classical discrete memoryless channels.
In the last chapter, we will show some applications of our results to a practical
optical communication scenario, in which the information is encoded in continuous 
quantum degrees of freedom, which are analogous to classical channels with Gaussian 
noise. 
Résumé



Multiparty information theory studies communication problems with several senders, several receivers and relay stations. The objective of this thesis is to extend the central ideas of classical information theory to the study of quantum channels. We consider the following communication scenarios: quantum multiple access channels, quantum interference channels, quantum broadcast channels and quantum relay channels. In each of these scenarios, we characterize the achievable communication rates for sending classical information over these quantum channels.

Quantum modelling of communication channels is important because it provides a more accurate representation of the channel's ability to transmit information. By using physically realistic models for the channel outputs and the detection procedure, we obtain better communication rates than those obtained in a classical model. Indeed, collective quantum measurements on all of the physical systems at the channel output allow better extraction of information than independent measurements on each subsystem. We have chosen to study channels with classical input and quantum output, which constitute a useful abstraction for the study of general quantum channels in which the encoding is restricted to the classical domain.

We study the asymptotic regime in which many copies of the channel are used in parallel and the uses are independent. In this context, it is possible to characterize the absolute limits on the information transmission of a channel, provided the statistics of the noise on the channel are known. These theoretical results can be used as a benchmark to evaluate the performance of practical communication protocols.

We mostly consider channels whose outputs are finite-dimensional quantum systems, analogous to classical discrete channels. The last chapter presents practical applications of our results to optical communication, where the physical systems have continuous degrees of freedom. This setting is analogous to classical channels with Gaussian noise.
Chapter 1
Introduction 



The central theme of this work is the transmission of information through noisy com- 
munication channels. The word information means different things to different people, 
so it is worthwhile to begin the discussion with a clear definition of the term. State- 
ments like "Canada has an information-based economy" suggest that information is 
some kind of commodity that can be shipped on trains for export like oil or lumber. 
In the world of digital electronics, the word information is used as a synonym for the 
word data as in "How much information can you store on your USB memory stick?"
In that context, most people would say that a 7MB mp3 file contains just as much
information as a 7MB file full of zeros.

In this work we will use the term information in the sense originally defined by 
Claude Shannon [Sha48]. Shannon realized that in order to study the problems of
information storage and information transmission mathematically, we must step away 
from the semantics of the messages and focus on their statistics. Using the notions of 
entropy, conditional entropy and mutual information, we can quantify the information 
content of data sources and the information transmitting abilities of noisy communi- 
cation channels. 

We can arrive at an operational interpretation of the information content of a data 
source in terms of our ability to compress it. The more unpredictable the content of 
the data is, the more information it contains. Indeed, if we use WinZip to compress 
the mp3 file and the file full of zeros, we will see that the latter will result in a much 
smaller zip file, which is expected since a file full of zeros has less uncertainty and, by
extension, contains less information.

We can similarly give an operational interpretation of the information carrying 
capacity of a noisy communication channel in terms of our ability to convert it into 
a noiseless channel. Channels with more noise have a smaller capacity for carrying 
information. Consider a channel which allows us to send data at the rate of 1 MB/sec 
on which half of the packets sent get lost due to the effects of noise on the channel. It 
is not true that the capacity of such a channel is 1 MB/sec, because we also have to 
account for the need to retransmit lost packets. In order to correctly characterize the 
information carrying capacity of a channel, we must consider the rate of the end-to- 
end protocol which converts many uses of the noisy channel into an effectively noiseless 
communication channel. 



1.1 Information theory



Information theory studies models of communication which are amenable to mathematical analysis. In order to model the effects of noise in a point-to-point communication scenario, we represent the inputs and outputs of the channel probabilistically. We describe the channel as a triple $(\mathcal{X}, p_{Y|X}(y|x), \mathcal{Y})$, where $\mathcal{X}$ is the set of possible symbols that the Transmitter (Tx) can send, $\mathcal{Y}$ is the set of possible outputs that the Receiver (Rx) can obtain and $p_{Y|X}(y|x)$ is a conditional probability distribution describing the channel's transition probabilities. This model is illustrated in Figure 1.1, where random variables are pictorially represented as small triangles (▷). For example, the noiseless binary channel is represented as the triple $(\{0,1\}, p_{Y|X}(y|x) = \delta(x,y), \{0,1\})$. Using this model of the channel, it is possible to calculate the optimal communication rates from Transmitter to Receiver in the limit of many independent uses of the channel [Sha48]. These theoretical results have wide-reaching applications in many areas of communication engineering but also in other fields like cryptography, computer science, neuroscience and even economics. So long as a probabilistic model for the channel at hand is available, we can use this model and the techniques of information theory to arrive at precise mathematical statements about its suitability for a given communication task in the limit of many uses of the channel.

Figure 1.1: A point-to-point channel $\mathcal{N} \equiv p_{Y|X}(y|x)$.
1.2 Network information theory



Network information theory is the extension of Shannon's model of noisy channels to communication scenarios with multiple senders and multiple receivers [EGC80, CT91, EGK10]. To model these channels probabilistically, we use multivariate conditional probability distributions. Some of the most important problems in network information theory are shown in Figure 1.2, and the relevant class of probability distributions is also indicated.




(a) MAC $\equiv p(y|x_1,x_2)$   (b) BC $\equiv p(y_1,y_2|x)$   (c) IC $\equiv p(y_1,y_2|x_1,x_2)$   (d) RC $\equiv p(y_1,y|x,x_1)$



Figure 1.2: Classical network information theory studies communication channels with 
multiple senders and multiple receivers. These include, among others, (a) multiple access 
channels (MACs), (b) broadcast channels (BCs), (c) interference channels (ICs), and (d) 
relay channels (RCs). 



Each of the above channels is a model for some practical communication scenario. 
In the multiple access channel, there are multiple transmitters trying to talk to a sin- 
gle base station, and we can describe the tradeoff between the communication rates 
that are achievable for the inbound communication links. The broadcast channel is the 
dual problem in which a single transmit antenna emits multiple information streams in- 
tended for different receivers. We can additionally have a common information stream 
intended for both receivers. Coding strategies for broadcast channels involve encodings 
that can "mix" the information streams to produce the transmit signal. Interference 
channels model situations where multiple independent transmissions are intended, but 
crosstalk occurs because the communication takes place in a shared medium. The re- 
lay channel is a multi-hop information network. The Relay is assumed to decode the 
message during one block of uses of the channel and re-transmit the information it has 
decoded during the next block. This allows the Receiver to collectively decode the in- 
formation from both the Transmitter and the Relay and achieve better communication 
rates than what would be possible with point-to-point codes. 
1.3 Quantum channels



Classical models are not adequate for the characterization of the information carrying 
capacity of communication systems in which the information carriers are quantum sys- 
tems. Such systems need not be exotic: in optical communication links, the carriers are 
photons, which are properly described by quantum electrodynamics and only approx- 
imately described by Maxwell's equations. A more general model for communication 
channels is one which takes into account the underlying laws of physics concerning the 
encoding, transmission and decoding of information using quantum systems. Quantum 
decoding based on collective measurements of all the channel outputs in parallel can be 
shown to achieve higher communication rates compared to classical decoding strategies 
in which the channel outputs are measured individually. 

Of particular interest are classical-quantum channel models, which model the sender's inputs as classical variables and the receiver's outputs as quantum systems. A classical-quantum channel $(\mathcal{X}, \mathcal{N}^{X \to B}(x) \equiv \rho_x^B, \mathcal{H}^B)$ is fully specified by the finite set of output states $\{\rho_x^B\}$ it produces for each of the possible inputs $x \in \mathcal{X}$. Figure 1.3 depicts a classical-quantum channel, in which the quantum output system is represented by a circle: ○. Such channels form a useful abstraction for studying the transmission of classical data over quantum channels. The Holevo-Schumacher-Westmoreland (HSW) Theorem (see page 32) establishes the maximum achievable communication rates for classical-quantum channels [Hol98, SW97].

Figure 1.3: A point-to-point classical-quantum channel $\{\rho_x^B\}$.



Note that a classical-quantum (c-q) channel corresponds to the use of a quantum- 
quantum (q-q) channel in which the sender is restricted to selecting from a finite set 
of signalling states. Any code construction for a c-q channel can be augmented with
an optimization over the choice of signal states to obtain a code for a q-q channel. For 
this reason, we restrict our study here to that of c-q channels. 

The study of quantum channels finds practical applications in optical communications. Bosonic channels model the quantum aspects of optical communication links. It is known that optical receivers based on collective quantum measurements of the channel outputs outperform classical strategies, particularly in the low-photon-number regime [GGL+04, Guh11, WGTL12]. In other words, quantum measurements are necessary to achieve the ultimate information carrying capacity of these channels. In [GGL+04] it is also demonstrated that classical encoding is sufficient to achieve the Holevo capacity of the lossy bosonic channel, giving further motivation for the theoretical study of classical-quantum models.



1.4 Research contributions 



This thesis presents a collection of results for problems in network information theory 
for classical-quantum channels. As we stated before, the results here easily extend to
quantum-quantum channels. The problems considered are illustrated in Figure 1.4.



(a) QMAC $\equiv \{\rho^{B}_{x_1,x_2}\}$   (b) QIC $\equiv \{\rho^{B_1B_2}_{x_1,x_2}\}$   (c) QBC $\equiv \{\rho^{B_1B_2}_{x}\}$   (d) QRC $\equiv \{\rho^{B_1B}_{x,x_1}\}$



Figure 1.4: Network information theory can be extended to channels with quantum out- 
puts. We call these "classical-quantum channels," and consider the following communication 
scenarios: (a) multiple access channels (QMACs), (b) interference channels (QICs), (c) broad- 
cast channels (QBCs), and (d) relay channels (QRCs). 



Most of the results presented in this thesis have appeared in publication. The new results on the quantum multiple access channel and the quantum interference channel appeared in [FHS+12], which is a collaboration between Omar Fawzi, Patrick Hayden, Pranab Sen, Mark M. Wilde and the present author. That paper has been accepted for publication in the IEEE Transactions on Information Theory. A more compact version of the same results was presented by the author at the 2011 Allerton conference [FHS+11]. A follow-up paper on the bosonic quantum interference channel was presented by the author at the 2011 International Symposium on Information Theory, thanks to a collaboration with Saikat Guha and Mark M. Wilde [GSW11]. A further collaboration with Mark M. Wilde led to the publication of [SW12], which describes two coding strategies for the quantum broadcast channel. Finally, a collaboration with Mark M. Wilde and Mai Vu led to the development of the coding strategy for the quantum relay channel presented in [SWV12]. The last two papers have been accepted for presentation at the 2012 International Symposium on Information Theory.

Our aim has been to present a comprehensive account of the current state of knowledge in quantum network information theory, analogous to the review paper by Cover and El Gamal [EGC80]. Indeed, the current work contains the classical-quantum extension of many of the results presented in that paper. Towards this aim, we have chosen to include in the text the statement of several important results by others. These include a proof of the capacity theorem for point-to-point c-q channels that differs from the original ones due to Holevo, Schumacher and Westmoreland [Hol98, SW97] and the capacity result for the quantum multiple access channel, originally due to Winter [Win01]. We will also present an alternate achievability proof of the quantum Chong-Motani-Garg rate region for the QIC, which was originally proved by Sen [Sen12a].



1.5 Thesis overview 



Each of the communication problems covered in this thesis is presented in a separate chapter, and each chapter is organized in the same manner. The exposition in each chapter is roughly self-contained, but the ideas developed in Chapter 4 are of key importance to all other results in the thesis. Chapters 3 through 7 present results on classical-quantum (c-q) channels where the output systems are arbitrary quantum states in finite dimensional Hilbert spaces. This class of channels generalizes the class of classical discrete memoryless channels. The last chapter, Chapter 8, introduces the basic notions of quantum optics and studies bosonic quantum channels, for which the output system is a quantum system with continuous degrees of freedom.

Necessary background material on the notion of a classical typical set and its quantum analogue, the quantum typical subspace, is presented in Chapter 2. A more detailed discussion about typicality is presented in the appendix. Appendix A.1 concerns classical typical sets whereas Appendix B.1 reviews the properties of quantum typical subspaces and quantum typical projectors. Of particular importance are conditionally typical projectors, which are used throughout the proofs in this work.

Our exploration of the classical-quantum world of communication channels begins in Chapter 3, where we discuss classical and classical-quantum models of point-to-point communication. We will state and prove the capacity result for each class of channels: Shannon's classical channel coding theorem (Theorem 3.1) and the Holevo-Schumacher-Westmoreland theorem (Theorem 3.2) concerning the capacity of the classical-quantum channel.



Chapter 4 presents results on the quantum multiple access channel (QMAC) and discusses the different coding strategies that can be employed. The capacity of the QMAC was established by Winter in [Win01] (Theorem 4.1) using a successive decoding strategy. Our contribution to the quantum multiple access channel problem is Theorem 4.2, which shows that two-sender simultaneous decoding is possible. This result and the proof techniques used therein will form key building blocks for the results in subsequent chapters. The proof of Theorem 4.2 is the result of a longstanding collaboration within our research group.

Chapter 5 will present results on quantum interference channels. These include the calculation of the capacity region for the quantum interference channel in two special cases and a description of the quantum Han-Kobayashi rate region [FHS+11, FHS+12]. In that chapter, we also provide an alternate proof of the achievability of the quantum Chong-Motani-Garg rate region, which was first established by Sen in [Sen12a]. This new proof is original to this thesis.

Chapter 6 is dedicated to the quantum broadcast channel problem. We prove two theorems: the superposition coding inner bound (Theorem 6.1), which was first proved in [YHD11] using a different approach, and the Marton inner bound with no common message (Theorem 6.2).



In Chapter 7 we will present Theorem 7.1, which establishes the partial decode-and-forward inner bound for the quantum relay channel. The decode-and-forward and direct coding strategies for the quantum relay channel are also established, since they are special cases of the more general Theorem 7.1.



Chapter 8 discusses the free-space optical communication interference channel in
the presence of background thermal noise. This is a model for the crosstalk between 
two optical communication links. This chapter demonstrates the practical aspect of 
the ideas developed in this thesis. 

We conclude with Chapter 9, wherein we state open problems and describe avenues
for future research. 
Chapter 2
Background 



In this chapter we present the background material needed for
the results presented in subsequent chapters.

2.1 Notation 

We will denote the set $\{1,2,\ldots,n\}$ as $[1:n]$ or with the shorthand $[n]$. A random variable $X$, defined over a finite set $\mathcal{X}$, is associated with a probability distribution $p_X(x) = \Pr\{X = x\}$, where the lowercase $x$ is used to denote particular values of the random variable. Furthermore, let $\mathcal{P}(\mathcal{X})$ denote the set of all probability mass functions on the finite set $\mathcal{X}$. Conditional probability distributions will be denoted as $p_{Y|X}(y|x)$ or simply $p_{Y|X}$.

In order to help distinguish between the classical systems (random variables) and the quantum systems in the equations, we use the following naming conventions. Classical random variables will be denoted by letters near the end of the alphabet (such as $U$, $X_1$, $X_2$) and depicted as small triangles, ▷, in the diagrams of this thesis. The triangular shape was chosen in analogy to the 2-simplex $\mathcal{P}(\{1,2,3\})$. Quantum systems will be named with letters near the beginning of the alphabet (such as $A$, $B_2$) and represented by circles, ○, in diagrams. The circular shape is chosen in analogy with the Bloch sphere [LS11].

Consider a communication scenario with one or more senders (female) and one or more receivers (male). In diagrams, a sender is denoted Tx (short for Transmitter) and is associated with a random variable $X$. If there are multiple senders, then each of them will be referred to as Sender $k$ and associated with a random variable $X_k$. Receivers will be denoted as Rx 1, Rx 2 and each is associated with a different output of the channel. The outputs of a classical channel will be denoted as $Y_1$, $Y_2$, and the outputs of a quantum channel will be denoted as $\rho^{B_1}$, $\rho^{B_2}$.

The purpose of a communication protocol is to transfer bits of information from sender to receiver noiselessly. In this respect, the noiseless binary channel from sender to receiver is the standard unit resource for this task:

$$ (\mathcal{X} = \{0,1\},\ p_{Y|X}(y|x) = \delta(x,y),\ \mathcal{Y} = \{0,1\}) \equiv [c \to c], \tag{2.1} $$

where we have also defined the more compact notation $[c \to c]$. We will use $[c \to c]$ to denote the communication resource of being able to send one bit of classical information from the sender to the receiver [DHW08]. The square brackets indicate that the resource is noiseless. In order to describe multiuser communication scenarios, we extend this notation with superscripts indicating the sender and the receiver. Thus, in order to denote the noiseless classical communication of one bit from Sender $k$ to Receiver $\ell$ we will use the notation $[c^k \to c^\ell]$. The communication resource which corresponds to the sender being able to broadcast a message to Receiver 1 and Receiver 2 is denoted as $[c \to c^1c^2]$. All the coding theorems presented in this work are protocols for converting many copies of some noisy channel resource into noiseless classical communication between a particular sender and a particular receiver as described above.

Codebooks $\{x^n(m)\}_{m \in \mathcal{M}}$ are lookup tables for codewords representing a discrete set of messages $\mathcal{M} \equiv \{1,2,3,\ldots,|\mathcal{M}|\}$ that could be transmitted. A communication rate $R$ is a real number which describes our asymptotic ability to construct codes for a certain communication task. We will use the notation $|\mathcal{M}| = 2^{nR}$, and $\mathcal{M} = \{1,2,3,\ldots,|\mathcal{M}|\} = [1:2^{nR}]$, in which $2^{nR}$ should be interpreted to indicate $\lfloor 2^{nR} \rfloor$.

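Let $\mathbb{R}^n_+ \equiv \{v \in \mathbb{R}^n \mid v_i \geq 0, \forall i \in [1:n]\}$ be the non-negative subset of $\mathbb{R}^n$. We will denote a rate region as $\mathcal{R} \subseteq \mathbb{R}^n_+$ and the boundaries of regions as $\partial\mathcal{R}$. We denote points as $P \in \mathbb{R}^n$ and denote the convex hull of a set of points $\{P_i\}$ as $\mathrm{conv}(\{P_i\})$.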
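
The codebook notation can be made concrete with a small sketch. The Python fragment below is only an illustration: it assumes that codewords are drawn symbol by symbol, independently, from an input distribution $p_X$ (a random-codebook construction chosen here for the example), and the names `codebook_size` and `random_codebook` are hypothetical helpers rather than notation from this thesis.

```python
import math
import random


def codebook_size(n, rate):
    """Number of messages |M| = floor(2^{nR}) for block length n and rate R."""
    return math.floor(2 ** (n * rate))


def random_codebook(n, rate, p_x, seed=0):
    """Build a lookup table m -> x^n(m) for m in [1 : 2^{nR}].

    Assumption: each symbol of each codeword is drawn independently from p_x,
    which is given as a dict mapping symbols to probabilities.
    """
    rng = random.Random(seed)
    symbols = sorted(p_x)
    weights = [p_x[s] for s in symbols]
    return {
        m: tuple(rng.choices(symbols, weights=weights, k=n))
        for m in range(1, codebook_size(n, rate) + 1)
    }


codebook = random_codebook(n=8, rate=0.5, p_x={0: 0.5, 1: 0.5})
print(len(codebook))   # 2^{8 * 0.5} = 16 codewords
print(codebook[1])     # codeword x^n(1), a tuple of 8 symbols
```

The dictionary plays the role of the lookup table: the message index $m$ is the key and the codeword $x^n(m)$ is the value.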
2.2 Classical typicality

We present here a number of properties of typical sequences [CT91].



2.2.1 Typical sequences 

Consider the random variable $X$ with probability distribution $p_X(x)$ defined on a finite set $\mathcal{X}$. Denote by $|\mathcal{X}|$ the cardinality of $\mathcal{X}$. Let $H(X) \equiv H(p_X) \equiv -\sum_x p_X(x)\log_2 p_X(x)$ be the Shannon entropy of $p_X$, measured in units of bits. The binary entropy function is denoted $H_2(p_0) \equiv -p_0\log_2(p_0) - (1-p_0)\log_2(1-p_0) = H_2(p_1)$, where $p_0 = p_X(0)$ and $p_1 = 1 - p_0$.

Denote by $x^n$ a sequence $x_1x_2\cdots x_n$, where each $x_i$, $i \in [n]$, belongs to the finite alphabet $\mathcal{X}$. To avoid confusion, we use $i \in [1:n]$ to denote the index of a symbol $x$ in the sequence $x^n$ and $a \in [1, 2, \ldots, |\mathcal{X}|]$ to denote the different symbols in the alphabet $\mathcal{X}$.

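Define the probability distribution $p_{X^n}(x^n)$ on $\mathcal{X}^n$ to be the $n$-fold product of $p_X$: $p_{X^n}(x^n) \equiv \prod_{i=1}^n p_X(x_i)$. The sequence $x^n$ is drawn from $p_{X^n}$ if and only if each letter $x_i$ is drawn independently from $p_X$. For any $\delta > 0$, define the set of entropy $\delta$-typical sequences of length $n$ as:

$$ \mathcal{T}_\delta^{(n)}(X) \equiv \left\{ x^n \in \mathcal{X}^n : \left| -\frac{\log p_{X^n}(x^n)}{n} - H(X) \right| \leq \delta \right\}. \tag{2.2} $$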



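Typical sequences enjoy many useful properties [CT91]. For any $\epsilon, \delta > 0$, and sufficiently large $n$, we have

$$ \sum_{x^n \in \mathcal{T}_\delta^{(n)}(X)} p_{X^n}(x^n) \geq 1 - \epsilon, \tag{2.3} $$

$$ 2^{-n[H(X)+\delta]} \leq p_{X^n}(x^n) \leq 2^{-n[H(X)-\delta]} \quad \forall x^n \in \mathcal{T}_\delta^{(n)}(X), \tag{2.4} $$

$$ [1-\epsilon]\,2^{n[H(X)-\delta]} \leq |\mathcal{T}_\delta^{(n)}(X)| \leq 2^{n[H(X)+\delta]}. \tag{2.5} $$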



Property (2.3) indicates that a sequence $X^n$ of random variables distributed according to $p_{X^n} = \prod^n p_X$ (independent and identically distributed) is very likely to be typical, since all but $\epsilon$ of the weight of the probability mass function is concentrated on the typical set, which follows from the law of large numbers. Property (2.4) follows from the definition of the typical set (2.2). The lower bound on the probability of the typical sequences from (2.4) can be used to obtain an upper bound on the size of the typical set in (2.5). Similarly, the upper bound from (2.4) and equation (2.3) can be combined to give the lower bound on the size of the typical set in (2.5).

Figure 2.1: The typical set. Property (2.3) implies that draws of a random sequence $X^n \sim p_{X^n} = \prod^n p_X$ fall inside the typical set $\mathcal{T}_\delta^{(n)}(X) \subseteq \mathcal{X}^n$ with high probability. If draws from $X^n \sim \prod^n p_X$ are represented as points, then after many draws the typical set will become darker, as the shaded region in the diagram. The probability mass density on $\mathcal{T}_\delta^{(n)}(X)$ is approximately uniform: it varies between $2^{-n[H(X)+\delta]}$ and $2^{-n[H(X)-\delta]}$ (Property (2.4)), and the size of the shaded area will be at most $2^{n[H(X)+\delta]}$ (Property (2.5)). The non-typical set, $\mathcal{X}^n \setminus \mathcal{T}_\delta^{(n)}(X)$, will have at most $\epsilon$ weight in it (Property (2.3)).
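
The typical-set properties can be checked numerically for a small source by brute-force enumeration. The sketch below is an illustration only: the binary source $p_X = (0.8, 0.2)$, the block length $n = 12$ and $\delta = 0.2$ are arbitrary choices made for this example, and the helper names `entropy` and `typical_set` are hypothetical. Since $n = 12$ is far from the asymptotic regime, the weight of the typical set is not yet close to one, so the lower bound in (2.5) is checked with $1 - \epsilon$ replaced by the actual weight.

```python
import itertools
import math


def entropy(p):
    """Shannon entropy H(X) in bits of a distribution given as a dict."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def typical_set(p, n, delta):
    """Enumerate the entropy delta-typical sequences of length n.

    Assumption: every symbol in p has strictly positive probability.
    Returns a dict mapping each typical sequence x^n to its probability.
    """
    h = entropy(p)
    members = {}
    for seq in itertools.product(sorted(p), repeat=n):
        prob = math.prod(p[s] for s in seq)
        if abs(-math.log2(prob) / n - h) <= delta:
            members[seq] = prob
    return members


p_x = {0: 0.8, 1: 0.2}
n, delta = 12, 0.2
h = entropy(p_x)
members = typical_set(p_x, n, delta)
mass = sum(members.values())

print(len(members), 2 ** (n * (h + delta)))   # size versus the upper bound in (2.5)
print(mass)                                   # weight of the typical set, cf. (2.3)
```

Enumeration costs $|\mathcal{X}|^n$ steps, so this approach is only practical for very short block lengths; it is meant to make the definitions tangible, not to replace the asymptotic arguments.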



2.2.2 Conditional typicality 

Consider now the conditional probability distribution $p_{Y|X}(y|x)$ associated with a communication channel. The induced joint input-output distribution is $(X, Y) \sim p_X(x)p_{Y|X}(y|x)$, when $p_X(x)$ is used as the input distribution.

The conditional entropy $H(Y|X)$ for this distribution is

$$ H(Y|X) \equiv H(X,Y) - H(X) = \sum_a p_X(x_a) H(Y|x_a), \tag{2.6} $$

where $H(Y|x_a) \equiv -\sum_y p_{Y|X}(y|x_a)\log p_{Y|X}(y|x_a)$.



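We define the $\delta$-conditionally typical set $\mathcal{T}_\delta^{(n)}(Y|x^n) \subseteq \mathcal{Y}^n$ to consist of all sequences which are typically output when the input to the channel is $x^n$:

$$ \mathcal{T}_\delta^{(n)}(Y|x^n) \equiv \left\{ y^n \in \mathcal{Y}^n : \left| -\frac{\log p_{Y^n|X^n}(y^n|x^n)}{n} - H(Y|X) \right| \leq \delta \right\}, \tag{2.7} $$

with $p_{Y^n|X^n}(y^n|x^n) = \prod_{i=1}^n p_{Y|X}(y_i|x_i)$. The definition in (2.7) can be rewritten as:

$$ 2^{-n[H(Y|X)+\delta]} \leq p_{Y^n|X^n}(y^n|x^n) \leq 2^{-n[H(Y|X)-\delta]}, \quad \forall y^n \in \mathcal{T}_\delta^{(n)}(Y|x^n), \tag{2.8} $$

for any sequence $x^n$.

Figure 2.2: Illustration of the conditionally typical set $\mathcal{T}_\delta^{(n)}(Y|x^n)$ and the output-typical set $\mathcal{T}_\delta^{(n)}(Y)$. The "density" of $\mathcal{T}_\delta^{(n)}(Y)$, the lightly shaded area, is at least $2^{-n[H(Y)+\delta]}$, and the size of $\mathcal{T}_\delta^{(n)}(Y)$ is at most $2^{n[H(Y)+\delta]}$. The size of $\mathcal{T}_\delta^{(n)}(Y|x^n)$, the darker shaded region, is no greater than $2^{n[H(Y|X)+\delta]}$ for an $x^n$ picked on average.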



Suppose that a random input sequence $X^n \sim p_{X^n} = \prod^n p_X$ is passed through the channel $p_{Y^n|X^n}$. Then a conditionally typical sequence is likely to occur. More precisely, for any $\epsilon, \delta > 0$, and sufficiently large $n$, the following statement is true under the expectation over the input sequence $X^n$:

$$ \mathbb{E}_{X^n} \sum_{y^n \in \mathcal{T}_\delta^{(n)}(Y|X^n)} p_{Y^n|X^n}(y^n|X^n) \geq 1 - \epsilon. \tag{2.9} $$



We also have the following bounds on the expected size of the conditionally typical set:

$$ [1-\epsilon]\,2^{n[H(Y|X)-\delta]} \leq \mathbb{E}_{X^n} |\mathcal{T}_\delta^{(n)}(Y|X^n)| \leq 2^{n[H(Y|X)+\delta]}. \tag{2.10} $$




2.2.3 Output-typical set 



Consider the distribution over symbols $y \in \mathcal{Y}$ induced by the channel $\mathcal{N} \equiv p_{Y|X}(y|x)$ whenever the input distribution is $p_X(x)$:

$$ p_Y(y) \equiv \sum_x p_{Y|X}(y|x)\,p_X(x) = \mathbb{E}_X \mathcal{N}. \tag{2.11} $$



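We define the output-typical set as

$$ \mathcal{T}_\delta^{(n)}(Y) \equiv \left\{ y^n \in \mathcal{Y}^n : \left| -\frac{\log p_{Y^n}(y^n)}{n} - H(Y) \right| \leq \delta \right\}, \tag{2.12} $$

where $p_{Y^n} = \prod^n p_Y$. Note that the output-typical set is just a special case of the general typical set shown in (2.2). The terminology output-typical is introduced to help with the exposition.

When the input sequences are chosen according to $X^n \sim p_{X^n} = \prod^n p_X$, then output sequences are likely to be output-typical:

$$ \mathbb{E}_{X^n} \sum_{y^n \in \mathcal{T}_\delta^{(n)}(Y)} p_{Y^n|X^n}(y^n|X^n) \geq 1 - \epsilon. \tag{2.13} $$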



An illustration and an intuitive interpretation of (2.9), (2.10) and (2.13) is presented in Figure 2.2. The expression in (2.9) for the property of the conditionally typical set $\mathcal{T}_\delta^{(n)}(Y|x^n)$ is the analogue of the typical property (2.3) for $\mathcal{T}_\delta^{(n)}(X)$. The interpretation is that the codewords of a random codebook are likely to produce output sequences that fall within their conditionally typical sets. This property will be used throughout this thesis to guarantee that the decoding strategies based on conditionally typical sets correctly recognize the channel outputs. On the other hand, (2.10) gives us both an upper bound and a lower bound on the size of the conditionally typical set for a random codebook. Finally, Property (2.13) tells us that channel outputs which are not output-typical are unlikely.
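
The quantities that enter these definitions, the induced output distribution (2.11) and the conditional entropy (2.6), are easy to compute for a small channel. The following sketch is illustrative only: the binary symmetric channel with flip probability 0.1 and the input distribution $(0.7, 0.3)$ are arbitrary example values, and the function names are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a distribution given as a dict."""
    return -sum(q * math.log2(q) for q in dist.values() if q > 0)


def output_distribution(p_x, channel):
    """p_Y(y) = sum_x p_{Y|X}(y|x) p_X(x), as in (2.11)."""
    p_y = {}
    for x, px in p_x.items():
        for y, pyx in channel[x].items():
            p_y[y] = p_y.get(y, 0.0) + px * pyx
    return p_y


def conditional_entropy(p_x, channel):
    """H(Y|X) = sum_a p_X(x_a) H(Y|x_a), the right-hand side of (2.6)."""
    return sum(px * entropy(channel[x]) for x, px in p_x.items())


# Example values (assumed for illustration): binary symmetric channel, flip probability 0.1.
p_x = {0: 0.7, 1: 0.3}
channel = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}

p_y = output_distribution(p_x, channel)
joint = {(x, y): p_x[x] * channel[x][y] for x in p_x for y in channel[x]}

h_cond = conditional_entropy(p_x, channel)
h_chain = entropy(joint) - entropy(p_x)   # H(X,Y) - H(X)
print(p_y)
print(h_cond, h_chain)                    # the two expressions in (2.6) agree
```

Here the channel is stored as a nested dictionary `channel[x][y]` holding $p_{Y|X}(y|x)$, which mirrors the description of a channel by its transition probabilities.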
2.2.4 Joint typicality



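Consider now the joint probability distribution $p_{XY}(x,y) \in \mathcal{P}(\mathcal{X}, \mathcal{Y})$. Let $(X^n, Y^n)$ be a pair of random variables distributed according to the product distribution $\prod^n p_{XY}$.

We define the jointly typical set $\mathcal{J}_\delta^{(n)}(X,Y) \subseteq \mathcal{X}^n \times \mathcal{Y}^n$ to be the set of sequences that are typical with respect to the joint probability distribution $p_{XY}$ and with respect to the marginals $p_X$ and $p_Y$:

$$ \mathcal{J}_\delta^{(n)}(X,Y) \equiv \left\{ (x^n,y^n) \in \mathcal{X}^n \times \mathcal{Y}^n : x^n \in \mathcal{T}_{\delta'}^{(n)}(X),\ y^n \in \mathcal{T}_{\delta'}^{(n)}(Y),\ (x^n,y^n) \in \mathcal{T}_{\delta''}^{(n)}(X,Y) \right\}. \tag{2.14} $$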



A multi-variable sequence, therefore, is jointly typical if and only if all the sequences
in the subsets of the variables are jointly typical. 

The probability that two random sequences drawn from the marginals $X^n \sim \prod^n p_X$ and $Y^n \sim \prod^n p_Y$ are jointly typical can be bounded from above by $2^{-n[I(X;Y)-\delta]}$. This is straightforward to see from the definition in (2.14) and the properties of typical sets.

If $(x^n, y^n)$ is such that $x^n \in \mathcal{T}_{\delta'}^{(n)}(X)$ and $y^n \in \mathcal{T}_{\delta'}^{(n)}(Y)$, then $p_{X^n}(x^n) \leq 2^{-n[H(X)-\delta']}$ and $p_{Y^n}(y^n) \leq 2^{-n[H(Y)-\delta']}$. On the other hand, we know that the number of sequences that are typical according to the joint distribution is no larger than $2^{n[H(XY)+\delta'']}$. Combining these two observations we get:

$$ \sum_{(x^n,y^n) \in \mathcal{J}_\delta^{(n)}(X,Y)} p_{X^n}(x^n)\,p_{Y^n}(y^n) \leq |\mathcal{T}_{\delta''}^{(n)}(X,Y)|\; 2^{-n[H(X)-\delta']}\, 2^{-n[H(Y)-\delta']} $$

$$ \leq 2^{n[H(XY)+\delta'']}\, 2^{-n[H(X)-\delta']}\, 2^{-n[H(Y)-\delta']} $$

$$ = 2^{-n[I(X;Y)-\delta]}. \tag{2.15} $$



Note that the parameter $\delta = 2\delta' + \delta''$ is a function of our choice of typicality parameters
for the typical sets.
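
The bound on the probability that independently drawn sequences look jointly typical can be verified exactly for a toy distribution. The sketch below is an illustration under stated assumptions: the joint distribution (a uniform bit sent through a binary symmetric channel with flip probability 1/8), the block length $n = 8$ and the parameters $\delta' = 0.05$, $\delta'' = 0.1$ are example values, the jointly typical set uses $\delta'$ for the marginals and $\delta''$ for the joint distribution, and all function names are hypothetical.

```python
import itertools
import math


def entropy(dist):
    """Shannon entropy in bits of a distribution given as a dict."""
    return -sum(q * math.log2(q) for q in dist.values() if q > 0)


def marginals(p_xy):
    """Marginal distributions p_X and p_Y of a joint distribution p_XY."""
    p_x, p_y = {}, {}
    for (x, y), q in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + q
        p_y[y] = p_y.get(y, 0.0) + q
    return p_x, p_y


def is_typical(seq, dist, delta):
    """Entropy-typicality test; assumes every symbol of seq has positive probability."""
    prob = math.prod(dist[s] for s in seq)
    return abs(-math.log2(prob) / len(seq) - entropy(dist)) <= delta


def prob_jointly_typical(p_xy, n, d_marg, d_joint):
    """Probability that independent X^n ~ p_X^n and Y^n ~ p_Y^n are jointly typical."""
    p_x, p_y = marginals(p_xy)
    xs = [s for s in itertools.product(sorted(p_x), repeat=n) if is_typical(s, p_x, d_marg)]
    ys = [s for s in itertools.product(sorted(p_y), repeat=n) if is_typical(s, p_y, d_marg)]
    total = 0.0
    for xn in xs:
        px = math.prod(p_x[a] for a in xn)
        for yn in ys:
            if is_typical(tuple(zip(xn, yn)), p_xy, d_joint):
                total += px * math.prod(p_y[b] for b in yn)
    return total


# Example values (assumed): uniform input through a binary symmetric channel, flip probability 1/8.
p_xy = {(0, 0): 0.4375, (0, 1): 0.0625, (1, 0): 0.0625, (1, 1): 0.4375}
n, d_marg, d_joint = 8, 0.05, 0.1
p_x, p_y = marginals(p_xy)
mutual_info = entropy(p_x) + entropy(p_y) - entropy(p_xy)
delta = 2 * d_marg + d_joint

prob = prob_jointly_typical(p_xy, n, d_marg, d_joint)
bound = 2 ** (-n * (mutual_info - delta))
print(prob, bound)   # the computed probability does not exceed the bound in (2.15)
```

The double loop visits all $|\mathcal{X}|^n |\mathcal{Y}|^n$ pairs, so this check is feasible only for short blocks; for this example it amounts to 65536 pairs.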



2.3 Introduction to quantum information 

The use of quantum systems for information processing tasks is no more mysterious than the use of digital technology for information processing. The use of an analog to digital converter (ADC) to transform an analog signal to a digital representation and the use of a digital to analog converter (DAC) to transform from the digital world back into the analog world are similar to the state preparation and the measurement steps used in quantum information science. The digital world is sought after because of the computational, storage and communication benefits associated with manipulation of discrete systems instead of continuous signals. Similarly, there are benefits associated with using the quantum world (Hilbert space) in certain computation problems [Sho94, Sho95]. The use of digital and quantum technology can therefore both be seen operationally as a black box process with information encoding, processing and readout steps.

The focus of this thesis is the study of quantum aspects of communication which 
are relevant for classical communication tasks. In order to make the presentation 
more self-contained, we will present below a brief introduction to the subject which 
describes how quantum systems are represented, how information can be encoded and 
how information can be read out. 



2.3.1 Quantum states 



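In order to describe the state of a quantum system $B$ we use a density matrix $\rho^B$ acting
on a $d$-dimensional complex vector space $\mathcal{H}^B$ (Hilbert space). To be a density matrix,
the operator $\rho^B$ has to be Hermitian, positive semidefinite and have unit trace. We
denote the set of density matrices on a Hilbert space $\mathcal{H}^B$ as $\mathcal{D}(\mathcal{H}^B)$.

A common choice of basis for $\mathcal{H}^B$ is the standard basis $\{|0\rangle, |1\rangle, \ldots, |d-1\rangle\}$:

$$
|0\rangle \equiv \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad
|1\rangle \equiv \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad
\ldots, \quad
|d-1\rangle \equiv \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}, \tag{2.16}
$$

which is also known as the computational basis.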
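In two dimensions, another common basis is the Hadamard basis:

$$
|+\rangle \equiv \tfrac{1}{\sqrt{2}}|0\rangle + \tfrac{1}{\sqrt{2}}|1\rangle, \tag{2.17}
$$

$$
|-\rangle \equiv \tfrac{1}{\sqrt{2}}|0\rangle - \tfrac{1}{\sqrt{2}}|1\rangle. \tag{2.18}
$$

The eigen-decomposition of the density matrix gives us another choice of basis
in which to represent the state. Any density matrix can be written in the form:

$$
\rho^B =
\begin{pmatrix} |e_{\rho;1}\rangle & |e_{\rho;2}\rangle & \cdots & |e_{\rho;d}\rangle \end{pmatrix}
\begin{pmatrix}
\lambda_{\rho;1} & & & \\
 & \lambda_{\rho;2} & & \\
 & & \ddots & \\
 & & & \lambda_{\rho;d}
\end{pmatrix}
\begin{pmatrix} \langle e_{\rho;1}| \\ \langle e_{\rho;2}| \\ \vdots \\ \langle e_{\rho;d}| \end{pmatrix}, \tag{2.19}
$$

where the eigenvalues $\lambda_{\rho;i}$ are all real and nonnegative. In our notation, column vectors
are denoted as kets $|e_{\rho;i}\rangle$ and the dual (Hermitian conjugate) of a ket is the bra:
$\langle e_{\rho;i}| \equiv |e_{\rho;i}\rangle^\dagger$ (a row vector). We say that $\rho^B$ is a pure state if it has only a single
non-zero eigenvalue: $\lambda_{\rho;1} = 1$, $\lambda_{\rho;i} = 0$, $\forall i > 1$.

Because the density matrix is positive semidefinite and has unit trace ($\sum_i \lambda_{\rho;i} = 1$),
we can identify the eigenvalues of $\rho^B$ with a probability distribution: $p_Y(y) \equiv \lambda_{\rho;y}$. A
density matrix, therefore, corresponds to the probability distribution $p_Y(y)$ over the
subspaces $|e_{\rho;y}\rangle\langle e_{\rho;y}|$. This property will be important when we want to define the
typical subspace for the tensor product state: $(\rho^B)^{\otimes n} = \rho^{B_1} \otimes \cdots \otimes \rho^{B_n}$.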



Suppose that we have a two-party quantum state $\rho^{AB}$ such that Alice has the
subsystem $A$ and Bob has the subsystem $B$. The state in Alice's lab is described by
$\rho^A = \mathrm{Tr}_B[\rho^{AB}]$, where $\mathrm{Tr}_B$ denotes a partial trace over Bob's degrees of freedom.

In order to describe the "distance" between two quantum states, we use the notion
of trace distance. The trace distance between states $\sigma$ and $\rho$ is $\|\sigma - \rho\|_1 \equiv \mathrm{Tr}|\sigma - \rho|$,
where $|X| \equiv \sqrt{X^\dagger X}$. Two states that are similar have trace distance close to zero,
whereas states that are perfectly distinguishable have trace distance equal to two.



Two quantum states can "substitute" for one another up to a penalty proportional 
to the trace distance between them: 
Lemma 2.1. Let $0 \le \rho, \sigma, \Lambda \le I$. Then

$$
\mathrm{Tr}[\Lambda\rho] \le \mathrm{Tr}[\Lambda\sigma] + \|\rho - \sigma\|_1. \tag{2.20}
$$

Proof. This follows from a variational characterization of trace distance as the
distinguishability of the states under an optimal measurement operator $M$:

$$
\begin{aligned}
\|\rho - \sigma\|_1 &= 2 \max_{0 \le M \le I} \mathrm{Tr}[M(\rho - \sigma)] \\
  &\ge \max_{0 \le M \le I} \mathrm{Tr}[M(\rho - \sigma)] \\
  &\overset{\text{①}}{\ge} \mathrm{Tr}[\Lambda(\rho - \sigma)] \\
  &= \mathrm{Tr}[\Lambda\rho] - \mathrm{Tr}[\Lambda\sigma].
\end{aligned}
$$

Inequality ① follows since the operator $\Lambda$, $0 \le \Lambda \le I$, is a particular choice of the
measurement operator $M$. □
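As a sanity check on Lemma 2.1, the inequality can be tested on randomly generated states. This is a minimal numerical sketch, assuming NumPy; the helper names and the way the random operators are sampled are our own choices for illustration.

```python
import numpy as np

def random_density_matrix(d, rng):
    """Random d x d density matrix: positive semidefinite with unit trace."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def random_effect(d, rng):
    """Random operator Lambda with 0 <= Lambda <= I (largest eigenvalue scaled to 1)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    a = g @ g.conj().T
    return a / np.linalg.eigvalsh(a).max()

def trace_norm(x):
    """Trace norm of a Hermitian matrix: sum of absolute values of its eigenvalues."""
    return float(np.abs(np.linalg.eigvalsh(x)).sum())

def lemma_2_1_gap(rho, sigma, lam):
    """Tr[Lambda sigma] + ||rho - sigma||_1 - Tr[Lambda rho]; nonnegative by Lemma 2.1."""
    return float(np.trace(lam @ sigma).real + trace_norm(rho - sigma)
                 - np.trace(lam @ rho).real)

rng = np.random.default_rng(0)
rho, sigma = random_density_matrix(3, rng), random_density_matrix(3, rng)
print(trace_norm(rho - sigma), lemma_2_1_gap(rho, sigma, random_effect(3, rng)))
```

The same `trace_norm` helper returns two for a pair of orthogonal pure states, in line with the remark above about perfectly distinguishable states.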



Most of the quantum systems considered in this document are finite dimensional,
but it is worth noting that there are also quantum systems with continuous degrees of
freedom which are represented in infinite dimensional Hilbert spaces. We will discuss
the infinite dimensional case in Chapter 8 where we consider the quantum aspects of
optical communication.



2.3.2 Quantum channels 



By convention we will denote the input state as $\sigma$ (for sender) and the outputs of the
channel as $\rho$ (for receiver). A noiseless quantum channel is represented by a unitary
operator $U$ which acts on the input state $\sigma$ by conjugation to produce the output state
$\rho = U\sigma U^\dagger$. General quantum channels are represented by completely-positive
trace-preserving (CPTP) maps $\mathcal{N}^{A\to B}$, which accept input states in $A$ and produce output
states in $B$: $\rho^B = \mathcal{N}^{A\to B}(\sigma^A)$.

If the sender wishes to transmit some classical message $m$ to the receiver using a
quantum channel, her encoding procedure will consist of a classical-to-quantum encoder
$\mathcal{E}: m \to \sigma_m^A$, to prepare a message state $\sigma_m^A \in \mathcal{D}(\mathcal{H}^A)$ suitable as input for the channel.
We call this the state preparation step.

If the sender's encoding is restricted to transmitting a finite set of orthogonal states
$\{\sigma_x^A\}_{x \in \mathcal{X}}$, then we can consider the choice of the signal states $\{\sigma_x^A\}$ to be part of the
channel. Thus we obtain a channel with classical inputs $x \in \mathcal{X}$ and quantum outputs:
$\rho_x^B \equiv \mathcal{N}^{X\to B}(x) \equiv \mathcal{N}^{A\to B}(\sigma_x^A)$. A classical-quantum channel, $\mathcal{N}^{X\to B}$, is represented
by the set of $|\mathcal{X}|$ possible output states $\{\rho_x^B \equiv \mathcal{N}^{X\to B}(x)\}$, meaning that each classical
input $x \in \mathcal{X}$ leads to a different quantum output $\rho_x^B \in \mathcal{D}(\mathcal{H}^B)$.

2.3.3 Quantum measurement 

The decoding operations performed by the receivers correspond to quantum measurements
on the outputs of the channel. A quantum measurement is a positive operator-valued
measure (POVM) $\{\Lambda_m\}_{m \in \{1,\ldots,|\mathcal{M}|\}}$ on the system $B$, the output of which we
denote $M'$. The probability of outcome $M' = m$ when the state $\rho^B$ is measured is
given by the Born rule:

$$
\Pr\{M' = m\} = \mathrm{Tr}[\Lambda_m \rho^B]. \tag{2.21}
$$

To be a valid POVM, the set of operators $\Lambda_m$ must all be positive semidefinite
and sum to the identity: $\Lambda_m \ge 0$, $\sum_m \Lambda_m = I$.
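To make the two POVM conditions and the Born rule concrete, here is a small sketch using the three-outcome "trine" measurement on a qubit. The choice of example and the function names `is_povm` and `born_probabilities` are ours, introduced only for illustration.

```python
import numpy as np

def is_povm(effects, tol=1e-9):
    """Check that all operators are positive semidefinite and sum to the identity."""
    d = effects[0].shape[0]
    positive = all(np.linalg.eigvalsh(e).min() >= -tol for e in effects)
    complete = np.allclose(sum(effects), np.eye(d), atol=tol)
    return bool(positive and complete)

def born_probabilities(effects, rho):
    """Outcome probabilities Pr{M' = m} = Tr[Lambda_m rho]."""
    return np.array([np.trace(e @ rho).real for e in effects])

# Trine POVM: Lambda_k = (2/3)|psi_k><psi_k| with three real unit vectors
# separated by angles of 2*pi/3 in the plane.
angles = [2 * np.pi * k / 3 for k in range(3)]
trine = [(2 / 3) * np.outer(v, v)
         for v in (np.array([np.cos(a), np.sin(a)]) for a in angles)]

rho = np.array([[1.0, 0.0], [0.0, 0.0]])   # the pure state |0><0|
print(is_povm(trine), born_probabilities(trine, rho))
```

The trine operators are not projectors, which illustrates that a POVM can have more outcomes than the dimension of the system.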

A quantum instrument $\{T_k\}_k$ is a more general operation which consists of a
collection of completely positive (CP) maps such that $\sum_k T_k$ is trace preserving [DL70].
When applied to a quantum state $\sigma^A$ the different elements are applied with probability
$p_k = \mathrm{Tr}[T_k(\sigma^A)]$ resulting in different normalized outcomes $\rho_k^B = \frac{1}{p_k} T_k(\sigma^A)$.

2.3.4 Quantum information theory 

Many of the fundamental ideas of quantum information theory are analogous to those
of classical information theory. For example, we quantify the information content of
quantum systems using the notion of entropy.

Definition 2.1 (von Neumann Entropy). Given the density matrix $\rho^A \in \mathcal{D}(\mathcal{H}^A)$, the
expression

$$
H(A)_\rho = -\mathrm{Tr}(\rho^A \log \rho^A) \tag{2.22}
$$

is known as the von Neumann entropy of the state $\rho^A$.

Note that the symbol $H$ is used for both classical and quantum entropy. The
von Neumann entropy of a quantum state with spectral decomposition $\rho^A = \sum_i \lambda_i |e_i\rangle\langle e_i|$
is equal to the Shannon entropy of its eigenvalues:

$$
H(A)_\rho = -\mathrm{Tr}(\rho^A \log \rho^A) = -\sum_i \lambda_i \log \lambda_i = H(\{\lambda_i\}). \tag{2.23}
$$
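Equation (2.23) translates directly into a numerical recipe: diagonalize the density matrix and compute the Shannon entropy of the eigenvalues. The sketch below assumes base-2 logarithms (entropy in bits) and discards eigenvalues below a small tolerance; both choices are ours.

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """Entropy in bits of a density matrix, computed from its eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > tol]          # the convention 0 log 0 = 0
    return float(-(evals * np.log2(evals)).sum())

plus = np.array([1.0, 1.0]) / np.sqrt(2)
pure_state = np.outer(plus, plus)       # |+><+|, a pure state
mixed_state = np.eye(2) / 2             # maximally mixed qubit
print(von_neumann_entropy(pure_state), von_neumann_entropy(mixed_state))
```

A pure state has zero entropy in any basis, while the maximally mixed qubit carries one bit, which matches the classical intuition for a deterministic variable and a fair coin.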



For bipartite states $\rho^{AB}$ we can also define the quantum conditional entropy

$$
H(A|B)_\rho \equiv H(AB)_\rho - H(B)_\rho \tag{2.24}
$$

where $H(B)_\rho = -\mathrm{Tr}(\rho^B \log \rho^B)$ is the entropy of the reduced density matrix $\rho^B =
\mathrm{Tr}_A(\rho^{AB})$. In the same fashion we can also define the quantum mutual information

$$
I(A;B)_\rho \equiv H(A)_\rho + H(B)_\rho - H(AB)_\rho, \tag{2.25}
$$

and in the case of a tripartite system $\rho^{ABC}$ we define the conditional mutual information
as

$$
\begin{align}
I(A;B|C)_\rho &\equiv H(A|C)_\rho + H(B|C)_\rho - H(AB|C)_\rho \tag{2.26} \\
  &= H(AC)_\rho + H(BC)_\rho - H(ABC)_\rho - H(C)_\rho. \tag{2.27}
\end{align}
$$

It can be shown that $I(A;B|C)$ is non-negative for any tripartite state $\rho^{ABC}$. The
formula $I(A;B|C) \ge 0$ can also be written in the form

$$
H(AC) + H(BC) \ge H(C) + H(ABC). \tag{2.28}
$$

This inequality, originally proved in [LR73], is called the strong subadditivity of von
Neumann entropy and forms an important building block of quantum information
theory.
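Strong subadditivity (2.28) can be probed numerically on a random three-qubit state. The following sketch assumes base-2 logarithms and a subsystem ordering A, B, C = 0, 1, 2; the helper `reduced_state`, which implements the partial trace by reshaping, is our own illustrative construction.

```python
import numpy as np

def entropy(rho, tol=1e-12):
    """von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > tol]
    return float(-(evals * np.log2(evals)).sum())

def reduced_state(rho, dims, keep):
    """Partial trace over every subsystem not listed in keep (keep must be ascending)."""
    n = len(dims)
    t = rho.reshape(tuple(dims) * 2)      # row indices first, then column indices
    # Trace out from the highest subsystem index down, so that the positions
    # of the remaining lower-index axes are unaffected.
    for i in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=i, axis2=i + t.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

def ssa_gap(rho_abc, dims=(2, 2, 2)):
    """H(AC) + H(BC) - H(C) - H(ABC), which equals I(A;B|C) by (2.27)."""
    return (entropy(reduced_state(rho_abc, dims, [0, 2]))
            + entropy(reduced_state(rho_abc, dims, [1, 2]))
            - entropy(reduced_state(rho_abc, dims, [2]))
            - entropy(rho_abc))

rng = np.random.default_rng(0)
g = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
rho_abc = g @ g.conj().T
rho_abc /= np.trace(rho_abc).real
print(ssa_gap(rho_abc))
```

A numerical check of this kind is of course no substitute for the proof in [LR73]; it only illustrates how the quantities in (2.24) to (2.28) are computed from reduced density matrices.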

Consider the classical-quantum state $\rho^{XB}$ given by:

$$
\rho^{XB} = \sum_x p_X(x)\, |x\rangle\langle x|^X \otimes \rho_x^B. \tag{2.29}
$$

The conditional entropy $H(B|X)$ of this state is equal to:

$$
H(B|X) = \sum_x p_X(x) H(\rho_x^B) = \sum_x p_X(x) H(B)_{\rho_x}. \tag{2.30}
$$



2.4 Quantum typicality 

The notions of typical sequences and typical sets generalize to the quantum setting
by virtue of the spectral theorem. Let $\mathcal{H}^B$ be a $d_B$ dimensional Hilbert space and let
$\rho^B \in \mathcal{D}(\mathcal{H}^B)$ be the density matrix associated with a quantum state. We identify the
eigenvalues of $\rho^B$ with the probability distribution $p_Y(y) = \lambda_{\rho;y}$ and write the spectral
decomposition as:

$$
\rho^B = \sum_{y=1}^{d_B} p_Y(y)\, |e_{\rho;y}\rangle\langle e_{\rho;y}|^B, \tag{2.31}
$$

where $|e_{\rho;y}\rangle$ is the eigenvector of $\rho^B$ corresponding to eigenvalue $p_Y(y)$.

Define the set of $\delta$-typical eigenvalue labels according to the eigenvalue distribution
$p_Y$ as

$$
\mathcal{T}_\delta^{(n)}(Y) \equiv \left\{ y^n \in \mathcal{Y}^n \;:\; \left| -\frac{\log p_{Y^n}(y^n)}{n} - H(Y) \right| \le \delta \right\}. \tag{2.32}
$$

For a given string $y^n = y_1 y_2 \cdots y_i \cdots y_n$ we define the corresponding eigenvector as

$$
|e_{\rho;y^n}\rangle = |e_{\rho;y_1}\rangle \otimes |e_{\rho;y_2}\rangle \otimes \cdots \otimes |e_{\rho;y_n}\rangle, \tag{2.33}
$$

where for each symbol $y_i = b \in \{1, 2, \ldots, d_B\}$ we select the $b$-th eigenvector $|e_{\rho;b}\rangle$.
The typical subspace associated with the density matrix $\rho^B$ is defined as

$$
A^n_{\rho,\delta} = \mathrm{span}\left\{ |e_{\rho;y^n}\rangle \;:\; y^n \in \mathcal{T}_\delta^{(n)}(Y) \right\}. \tag{2.34}
$$

The typical projector is defined as

$$
\Pi^n_{\rho,\delta} = \sum_{y^n \in \mathcal{T}_\delta^{(n)}(Y)} |e_{\rho;y^n}\rangle\langle e_{\rho;y^n}|. \tag{2.35}
$$

Note that the typical projector is linked twofold to the spectral decomposition of (2.31):
the sequences $y^n$ are selected according to $p_Y$ and the set of typical vectors are built
from tensor products of orthogonal eigenvectors $|e_{\rho;y}\rangle$.



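Properties analogous to (2.3) - (2.5) hold. For any $\epsilon, \delta > 0$, and all sufficiently
large $n$ we have

$$
\begin{align}
\mathrm{Tr}\left[\rho^{\otimes n}\, \Pi^n_{\rho,\delta}\right] &\ge 1 - \epsilon, \tag{2.36} \\
2^{-n[H(B)_\rho + \delta]}\, \Pi^n_{\rho,\delta} \;\le\; \Pi^n_{\rho,\delta}\, \rho^{\otimes n}\, \Pi^n_{\rho,\delta} &\le 2^{-n[H(B)_\rho - \delta]}\, \Pi^n_{\rho,\delta}, \tag{2.37} \\
[1-\epsilon]\, 2^{n[H(B)_\rho - \delta]} \;\le\; \mathrm{Tr}\left[\Pi^n_{\rho,\delta}\right] &\le 2^{n[H(B)_\rho + \delta]}. \tag{2.38}
\end{align}
$$

Equation (2.36) tells us that most of the support of the state $\rho^{\otimes n}$ is within the
typical subspace. The interpretation of (2.37) is that the eigenvalues of the state $\rho^{\otimes n}$
are bounded between $2^{-n[H(B)_\rho + \delta]}$ and $2^{-n[H(B)_\rho - \delta]}$ on the typical subspace $A^n_{\rho,\delta}$.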
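For a single qubit and a small number of copies, the typical projector can be built explicitly from the eigen-decomposition of the state. The sketch below is illustrative only: it assumes base-2 logarithms, a full-rank state, and a block length small enough that the $d^n \times d^n$ matrices fit in memory. The function name `typical_projector` is ours.

```python
import itertools
from functools import reduce
import numpy as np

def typical_projector(rho, n, delta):
    """Return (projector, entropy) for the delta-typical subspace of rho^{(x) n}.

    Assumes rho is full rank, so that every eigenvalue has a finite logarithm.
    """
    evals, evecs = np.linalg.eigh(rho)            # p_Y(y) and the eigenvectors of rho
    h = float(-(evals * np.log2(evals)).sum())    # H(B)_rho in bits
    dim = rho.shape[0] ** n
    proj = np.zeros((dim, dim), dtype=complex)
    for ys in itertools.product(range(len(evals)), repeat=n):
        rate = -np.log2(evals[list(ys)]).sum() / n
        if abs(rate - h) <= delta:                # y^n is a typical label sequence
            vec = reduce(np.kron, [evecs[:, y] for y in ys])
            proj += np.outer(vec, vec.conj())
    return proj, h

rho = np.array([[0.5, 0.25], [0.25, 0.5]])        # eigenvalues 0.75 and 0.25
n, delta = 6, 0.2
proj, h = typical_projector(rho, n, delta)
rho_n = reduce(np.kron, [rho] * n)
print("rank:", round(np.trace(proj).real),
      "captured probability:", np.trace(proj @ rho_n).real)
```

The upper bounds on the rank and on the eigenvalues within the typical subspace hold for every $n$ with this definition, whereas the captured probability only approaches one as $n$ grows.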



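Signal states. Consider now a set of quantum states $\{\rho_{x_a}^B\}$, $x_a \in \mathcal{X}$. We perform a
spectral decomposition of each $\rho_{x_a}^B$ to obtain

$$
\rho_{x_a}^B = \sum_{y=1}^{d_B} p_{Y|X}(y|x_a)\, |e_{\rho_{x_a};y}\rangle\langle e_{\rho_{x_a};y}|^B, \tag{2.39}
$$

where $p_{Y|X}(y|x_a)$ is the $y$-th eigenvalue of $\rho_{x_a}^B$ and $|e_{\rho_{x_a};y}\rangle$ is the corresponding eigenvector.

We can think of $\{\rho_{x_a}^B\}$ as a classical-quantum (c-q) channel where the input is
some $x_a \in \mathcal{X}$ and the output is the corresponding quantum state $\rho_{x_a}^B$. If the channel is
memoryless, then for each input sequence $x^n = x_1 x_2 \cdots x_n$ we have the corresponding
tensor product output state:

$$
\rho_{x^n}^{B^n} = \rho_{x_1}^{B_1} \otimes \rho_{x_2}^{B_2} \otimes \cdots \otimes \rho_{x_n}^{B_n}. \tag{2.40}
$$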



2.4.1 Quantum conditional typicality 

Conditionally typical projector. Consider the ensemble $\{p_X(x_a), \rho_{x_a}\}$. The choice
of distributions induces the following classical-quantum state:

$$
\rho^{XB} = \sum_{x_a} p_X(x_a)\, |x_a\rangle\langle x_a|^X \otimes \rho_{x_a}^B. \tag{2.41}
$$

Figure 2.3: Illustration of a conditionally typical subspace for some sequence $x^n$, and the
output-typical subspace.



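We define $H(B|X)_\rho \equiv \sum_{x_a \in \mathcal{X}} p_X(x_a) H(\rho_{x_a})$ to be the conditional entropy of this
state. Expressed in terms of the eigenvalues of the signal states, the conditional entropy
becomes

$$
H(B|X)_\rho \equiv H(Y|X) \equiv \sum_{x_a} p_X(x_a) H(Y|x_a), \tag{2.42}
$$

where $H(Y|x_a) = -\sum_y p_{Y|X}(y|x_a) \log p_{Y|X}(y|x_a)$ is the entropy of the eigenvalue
distribution shown in (2.39).

We define the $x^n$-conditionally typical projector as follows:

$$
\Pi^{B^n}_{\rho_{x^n},\delta} \equiv \sum_{y^n \in \mathcal{T}_\delta^{(n)}(Y|x^n)} |e_{\rho_{x^n};y^n}\rangle\langle e_{\rho_{x^n};y^n}|, \tag{2.43}
$$

where the set of conditionally typical eigenvalues $\mathcal{T}_\delta^{(n)}(Y|x^n)$ consists of all sequences
$y^n$ which satisfy:

$$
\left| -\frac{\log p_{Y^n|X^n}(y^n|x^n)}{n} - H(Y|X) \right| \le \delta, \tag{2.44}
$$

with $p_{Y^n|X^n}(y^n|x^n) = \prod_{i=1}^n p_{Y|X}(y_i|x_i)$.

The states $|e_{\rho_{x^n};y^n}\rangle$ are built from tensor products of eigenvectors for the individual
signal states:

$$
|e_{\rho_{x^n};y^n}\rangle = |e_{\rho_{x_1};y_1}\rangle \otimes |e_{\rho_{x_2};y_2}\rangle \otimes \cdots \otimes |e_{\rho_{x_n};y_n}\rangle, \tag{2.45}
$$

where the string $y^n = y_1 y_2 \ldots y_i \ldots y_n$ varies over different choices of bases for $\mathcal{H}^B$. For
each symbol $y_i = b \in \{1, 2, \ldots, d_B\}$ we select $|e_{\rho_{x_a};b}\rangle$: the $b$-th eigenvector from the
eigenbasis of $\rho_{x_a}$ corresponding to the letter $x_i = x_a \in \mathcal{X}$.

The following bound on the rank of the conditionally typical projector applies:

$$
\mathrm{Tr}\left\{\Pi^{B^n}_{\rho_{x^n},\delta}\right\} \le 2^{n[H(B|X)_\rho + \delta]}. \tag{2.46}
$$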



2.5 Closing remarks 



In the next chapter, we will show how the properties of the typical sequences and typical 
subspaces can be used to construct coding theorems for classical and classical-quantum 
channels. 
Point-to-point communication

In this chapter we describe the point-to-point communication scenario in which there
is a single sender and a single receiver. In Section 3.1 we review Shannon's channel
coding theorem and give the details of the achievability proof in order to introduce the
idea of random coding in its simplest form. Our presentation is somewhat unorthodox
since we use only the properties of the conditionally typical sets and not the jointly
typical sets. However, following this approach allows us to directly generalize our proof
techniques to the quantum case.

In Section 3.2.1 we will discuss the Holevo-Schumacher-Westmoreland (HSW) Theorem
and show an achievability proof. We do so with the purpose of introducing important
background material on the construction of quantum decoding operators. We
show how to construct a decoding POVM defined in terms of the conditionally typical
projectors. Readers interested only in the essential parts should consult Lemma 3.1
and Lemma 3.2, since they will be used throughout the remainder of the text.



3.1 Classical channel coding 

The fundamental problem associated with communication channels is to calculate and
formally prove their capacity for information transmission. We can think of the use of
a channel $\mathcal{N}$ as a communication resource, of which we have $n$ instances. Each use of
the channel is assumed to be independent, and modelled by the conditional probability
distribution $p_{Y|X}(y|x)$, where $x$ and $y$ are elements from the finite sets $\mathcal{X}$, $\mathcal{Y}$. This is
called the discrete memoryless setting.

Our goal is to study the rate $R$ at which the channel $\mathcal{N}$ can be converted into
copies of the noiseless binary channel $[c \to c] \equiv \delta(x,y)$, $x, y \in \{0, 1\}$, which represents
the canonical unit resource of communication. This conversion can be expressed as
follows:

$$
n \cdot \mathcal{N} \;\longrightarrow\; nR \cdot [c \to c]. \tag{3.1}
$$

This equation describes a protocol in which $n$ units of the noisy communication resource
$\mathcal{N}$ are transformed into $nR$ bits of noiseless transmission, and the protocol succeeds
with probability $(1 - \epsilon)$. Note that we allow the communication protocol to fail with
probability $\epsilon$, but $\epsilon$ is an arbitrarily small number for sufficiently large $n$. To prove
that the rate $R$ is achievable, one has to describe the coding strategy and prove that
the probability of error for that strategy can be made arbitrarily small. Usually, the
right hand side in equation (3.1) is measured as the number of different messages
$\mathcal{M} \equiv \{1, 2, \ldots, 2^{nR}\} \equiv [1 : 2^{nR}]$ that can be transmitted using $n$ uses of the channel.
One can think of the $nR$ individual bits of the message as being noiselessly transmitted
to the receiver. The channel coding pipeline can then be described as follows:



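Figure 3.1: Classical channel coding setup. The diagram shows the encoding, transmission
and decoding steps of a communication protocol that uses $n$ copies of the classical channel
$\mathcal{N} = (\mathcal{X}, p_{Y|X}(y|x), \mathcal{Y})$. The input is a message $m \in \mathcal{M}$ and the output is the estimate $M' \in \mathcal{M}$.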



The probability of error when sending message $m$ is defined as $p_e(m) \equiv \Pr\{M' \neq m\}$,
where $M' \equiv \mathcal{D} \circ \mathcal{N}^n \circ \mathcal{E}(m)$ is the random variable associated with the output of
the protocol. The average probability of error over all messages is

$$
\bar{p}_e \equiv \frac{1}{|\mathcal{M}|} \sum_{m \in \mathcal{M}} p_e(m). \tag{3.2}
$$

This is the quantity we have to bound when we perform an error analysis of some
coding protocol.

Definition 3.1. An $(n, R, \epsilon)$ coding protocol consists of a message set $\mathcal{M}$, where
$|\mathcal{M}| = 2^{nR}$, an encoding map $\mathcal{E} : \mathcal{M} \to \mathcal{X}^n$ described by a codebook $\{x^n(m)\}_{m \in \mathcal{M}}$,
and a decoding map $\mathcal{D} : \mathcal{Y}^n \to \mathcal{M}$ such that the average probability of error is bounded
from above as $\bar{p}_e \le \epsilon$.

A rate $R$ is achievable if there exists an $(n, R - \delta, \epsilon)$ coding protocol for all $\epsilon, \delta > 0$
as $n \to \infty$.



3.1.1 Channel capacity 

The capacity $C$ of a channel is the maximum of the rates $R$ that are achievable, and
is established in Shannon's channel coding theorem.

Theorem 3.1 (Channel capacity [Sha48, Fei54]). The communication capacity of a
discrete memoryless channel $(\mathcal{X}, p_{Y|X}(y|x), \mathcal{Y})$ is given by

$$
C = \max_{p_X} I(X;Y), \tag{3.3}
$$

where the optimization is taken over all possible input distributions $p_X(x)$. The mutual
information is calculated on the induced joint probability distribution

$$
(X, Y) \sim p_{XY}(x, y) \equiv p_X(x)\, p_{Y|X}(y|x). \tag{3.4}
$$
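As a concrete instance of (3.3), the capacity of a channel with binary input can be approximated by a grid search over input distributions. The sketch below is only an illustration: it assumes base-2 logarithms, and the function names and the binary symmetric channel with crossover probability 0.1 are our own choices.

```python
import numpy as np

def mutual_information(p_x, channel):
    """I(X;Y) in bits for input distribution p_x and channel[x, y] = p_{Y|X}(y|x)."""
    p_x = np.asarray(p_x, dtype=float)
    channel = np.asarray(channel, dtype=float)
    p_xy = p_x[:, None] * channel                 # joint distribution of (3.4)
    p_y = p_xy.sum(axis=0)
    product = p_x[:, None] * p_y[None, :]
    mask = p_xy > 0                               # the convention 0 log 0 = 0
    return float((p_xy[mask] * np.log2(p_xy[mask] / product[mask])).sum())

def capacity_binary_input(channel, grid=1001):
    """Grid search over p_X = (q, 1 - q); returns (best rate, best q)."""
    qs = np.linspace(0.0, 1.0, grid)
    rates = [mutual_information([q, 1.0 - q], channel) for q in qs]
    best = int(np.argmax(rates))
    return rates[best], float(qs[best])

bsc = [[0.9, 0.1], [0.1, 0.9]]                    # binary symmetric channel
print(capacity_binary_input(bsc))
```

For the binary symmetric channel the maximum is attained by the uniform input distribution, and the result agrees with the closed form $1 - h_2(0.1)$, where $h_2$ is the binary entropy function. For larger input alphabets a grid search is impractical and an iterative method would be the natural replacement.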
The proof of a capacity theorem usually contains two parts:

• A direct coding part that shows that for all $\epsilon, \delta > 0$, there exists a codebook
$\mathcal{E}(m) \equiv \{x^n(m)\}$ of rate $R = C - \delta$ and a decoding map $\mathcal{D}$ with average
probability of error $\bar{p}_e \le \epsilon$.

• A converse part that shows that the rate $C$ is the maximum rate possible. A
converse theorem establishes that the probability of error for a coding protocol
$(n, C + \delta, \epsilon)$ is bounded away from zero (weak converse), or that the probability
of error goes exponentially to 1 (strong converse).



Proof. We give an overview of the achievability proof of Theorem 3.1 in order to
introduce key concepts, which will be used in the other proofs in this thesis.

We use a random codebook with $2^{nR} \equiv |\mathcal{M}|$ codewords $x^n \in \mathcal{X}^n$ generated
independently from the product distribution $p_{X^n}(x^n) = \prod_{i=1}^n p_X(x_i)$. When the sender wants
to send the message $m \in \mathcal{M}$, she will input the $m$-th codeword, which we will denote
as $x^n(m)$. Let $Y^n$ denote the resulting output of the channel. The distribution on the
output symbols induced by the input distribution is $p_Y(y) \equiv \sum_x p_{Y|X}(y|x)\, p_X(x)$. We
define the set of output-typical sequences $\mathcal{T}_\delta^{(n)}(Y)$ according to the distribution $p_Y$. For
any sequence $x^n$, denote the set of conditionally typical output sequences $\mathcal{T}_\delta^{(n)}(Y|x^n)$.

Given the output of the channel $y^n$, the receiver will use the following algorithm:

1. If $y^n \notin \mathcal{T}_\delta^{(n)}(Y)$, then an error is declared.

2. Return $m$ if $y^n$ is an element of the conditionally typical set $\mathcal{T}_\delta^{(n)}(Y|x^n(m))$.
Report an error if no match or multiple matches are found.
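The two-step decoding rule above can be sketched in code as follows. This is an illustration under stated assumptions: base-2 logarithms, entropy-typical sets in the sense of (2.44), a channel matrix with strictly positive entries, and `None` standing for a declared error. The function and variable names are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def typical_set_decode(y, codebook, p_x, channel, delta):
    """Return the decoded message index, or None when an error is declared.

    channel[x, y] = p_{Y|X}(y|x); all entries are assumed strictly positive.
    """
    y = np.asarray(y)
    n = len(y)
    p_x = np.asarray(p_x, dtype=float)
    channel = np.asarray(channel, dtype=float)
    p_y = p_x @ channel                               # induced output distribution
    h_y = entropy(p_y)
    h_y_given_x = sum(p * entropy(row) for p, row in zip(p_x, channel))
    # Step 1: declare an error if y^n is not output-typical.
    if abs(-np.log2(p_y[y]).sum() / n - h_y) > delta:
        return None
    # Step 2: collect every message whose conditionally typical set contains y^n.
    matches = []
    for m, x in enumerate(codebook):
        rate = -np.log2(channel[np.asarray(x), y]).sum() / n
        if abs(rate - h_y_given_x) <= delta:
            matches.append(m)
    return matches[0] if len(matches) == 1 else None

bsc = [[0.9, 0.1], [0.1, 0.9]]
codebook = [[0] * 10, [1] * 10]                       # two toy codewords, n = 10
received = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]             # codeword 0 with one bit flipped
print(typical_set_decode(received, codebook, [0.5, 0.5], bsc, delta=0.1))
```

Note that this decoder rejects the noiseless output `[0] * 10` as well, since a sequence with no flips at all is not conditionally typical for a channel that flips one bit in ten on average. This is an artifact of the small block length and is harmless in the asymptotic analysis.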

We now define the three types of errors that may occur in the protocol when the
message $m$ is being sent.

(E0): The event that the channel output is not output-typical: $\{Y^n \notin \mathcal{T}_\delta^{(n)}(Y)\}$.

(E1): The event that the channel output sequence is not in the conditionally typical
set $\{Y^n \notin \mathcal{T}_\delta^{(n)}(Y|x^n(m))\}$, which corresponds to the message $m$.

(E2): The event that $Y^n$ is output-typical and it falls in the conditionally typical set
for another message:

$$
\left\{ Y^n \in \mathcal{T}_\delta^{(n)}(Y) \right\} \cap \left\{ \bigcup_{m' \neq m} \left\{ Y^n \in \mathcal{T}_\delta^{(n)}(Y|x^n(m')) \right\} \right\}. \tag{3.5}
$$

We can bound the probability of all three events when a random codebook is used,
that is, we will take the expectation over the random choices of the symbols for each
codeword. We define the expectation of an event as the expectation of the associated
indicator random variable.

The bound $\mathbb{E}_{X^n} \Pr\{(E0)\} \le \epsilon$ follows from (2.13). The crucial observation for the proof
is to use the symmetry of the code construction: if the codewords for all the messages
are constructed identically, then it is sufficient to analyze the probability of error for
any one fixed message. We obtain a bound $\mathbb{E}_{X^n} \Pr\{(E1)\} \le \epsilon$ from (2.9).

In order to bound the probability of error event (E2), we will use the classical
packing lemma, Lemma A.1 in Appendix A.2. Using the packing lemma with $U = \emptyset$,
we obtain a bound on the probability that the conditionally typical sets for different
messages will overlap. We can thus bound the expectation of the probability of error
event (E2) as follows:

$$
\mathbb{E}_{X^n} \Pr\{(E2)\} \le |\mathcal{M}|\, 2^{-n[I(X;Y) - \delta]}.
$$

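We can now use the union bound to bound the overall probability of error for our
code as follows:

$$
\begin{aligned}
\mathbb{E}_{X^n}\{\bar{p}_e\} &\equiv \mathbb{E}_{X^n} \Pr\{(E0) \cup (E1) \cup (E2)\} \\
  &\le \mathbb{E}_{X^n} \Pr\{(E0)\} + \mathbb{E}_{X^n} \Pr\{(E1)\} + \mathbb{E}_{X^n} \Pr\{(E2)\} \\
  &\le \epsilon + \epsilon + |\mathcal{M}|\, 2^{-n[I(X;Y) - \delta]}.
\end{aligned}
$$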

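Thus, in the limit of many uses of the channel, we have:

$$
\mathbb{E}_{X^n}\{\bar{p}_e\} \le \epsilon', \tag{3.6}
$$

provided the rate $R < I(X;Y) - 2\delta$. Indeed, with $|\mathcal{M}| = 2^{nR}$ this condition gives
$|\mathcal{M}|\, 2^{-n[I(X;Y) - \delta]} < 2^{-n\delta}$, which vanishes as $n$ grows.

The last step is called derandomization. If the expected probability of error of
a random codebook can be bounded as above, then there must exist a particular
codebook with $\bar{p}_e \le \epsilon'$, which completes the proof. □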

Note that it is possible to use an expurgation step and throw out the worse half
of the codewords in order to convert the bound on the average probability of error $\bar{p}_e$
into a bound on the maximum probability of error $p_e^{\max} \equiv \max_m p_e(m)$ [CT91].



3.2 Quantum communication channels 

A quantum channel $(\mathcal{H}^A, \mathcal{N}^{A\to B}, \mathcal{H}^B)$ is described as a completely positive
trace-preserving map $\mathcal{N}^{A\to B}$ which takes a quantum system in state $\sigma^A \in \mathcal{D}(\mathcal{H}^A)$ as input
and outputs a quantum system $\rho^B \in \mathcal{D}(\mathcal{H}^B)$. Figure 3.2 shows an example of such a
channel. In recent years, the techniques of classical information theory have been
extended to the study of quantum channels. For a review of the subject see [Wil11].

Figure 3.2: A point-to-point quantum channel.

In addition to the standard problem of classical transmission of information (denoted
$[c \to c]$), for quantum channels we can study the transmission of quantum
information (denoted $[q \to q]$). If pre-shared entanglement between Transmitter and
Receiver is available, it can be used in order to improve the communication rates using
an entanglement-assisted protocol. There are multiple communication tasks and different
capacities associated with each task for any given quantum channel $\mathcal{N}$ [BSST99].
Some of the possible communication tasks, along with their associated capacities are:

• Classical data capacity: $C(\mathcal{N})$

• Quantum data capacity: $Q(\mathcal{N})$

• Entanglement-assisted classical data capacity: $C_{e\text{-}a}(\mathcal{N})$

• Entanglement-assisted quantum data capacity: $Q_{e\text{-}a}(\mathcal{N})$

The latter two are actually equivalent up to a factor of 2, because we can use the
superdense coding and quantum teleportation protocols to convert between them in the
presence of free entanglement [BW92, BBC+93].

In the context of quantum information theory, pre-shared quantum entanglement
between sender and receiver must be recognized as a communication resource. We
denote this resource $[qq]$ and must take into account the rates at which it is consumed
or generated as part of a communication protocol [DHW08]. It is interesting to note
that shared randomness (denoted $[cc]$), which is the classical equivalent of shared
entanglement, does not increase the capacity of point-to-point classical channels.



Classical-quantum channels 

In the previous section we introduced some of the main communication problems of
quantum information theory. The focus of this thesis will be the study of classical
communication ($[c \to c]$) over quantum channels, with no entanglement assistance. For
this purpose, we will use the classical-quantum (c-q) channel model, which corresponds
to the use of a quantum channel where the Sender is restricted to sending a finite
set of signal states $\{\sigma_x\}_{x \in \mathcal{X}}$. If we consider the choice of the signal states $\{\sigma_x\}$
to be part of the channel, we obtain a channel with classical inputs $x \in \mathcal{X}$ and
quantum outputs: $\mathcal{N}^{X\to B}(x) \equiv \mathcal{N}^{A\to B}(\sigma_x^A)$. Note that a classical-quantum channel
$(\mathcal{X}, \mathcal{N}^{X\to B}(x) \equiv \rho_x^B, \mathcal{H}^B)$ is fully specified by the finite set of output states $\{\rho_x^B\}$ it
produces for each of the possible inputs $x \in \mathcal{X}$. This channel model is a useful
abstraction for studying the transmission of classical data over quantum channels. Any
code construction for a c-q channel can be augmented with an optimization over the
choice of signal states $\{\sigma_x\}_{x \in \mathcal{X}}$ to obtain a code for a quantum channel. The
Holevo-Schumacher-Westmoreland Theorem establishes the classical capacity of the
classical-quantum channel [Hol98, SW97]. The strong converse was later proved in [ON99].

Figure 3.3: A point-to-point c-q channel $\{\rho_x\}$.



3.2.1 Classical-quantum channel coding 

The quantum channel coding problem for a point-to-point classical-quantum channel
$(\mathcal{X}, \mathcal{N}^{X\to B}(x) \equiv \rho_x^B, \mathcal{H}^B)$ is studied in the following setting.

Figure 3.4: HSW coding setup.

Let $x^n(m) \equiv x_1 x_2 \cdots x_n \in \mathcal{X}^n$ be the codeword which is input to the channel
when we want to send message $m$. The output of the channel will be the $n$-fold tensor
product state:

$$
\mathcal{N}^{\otimes n}(x^n(m)) \equiv \rho^{B^n}_{x^n(m)} \equiv \rho^{B_1}_{x_1(m)} \otimes \rho^{B_2}_{x_2(m)} \otimes \cdots \otimes \rho^{B_n}_{x_n(m)}. \tag{3.7}
$$

To extract the classical information encoded into this state, we must perform a
quantum measurement. The most general quantum measurement is described by a
positive operator-valued measure (POVM) $\{\Lambda_m\}_{m \in \mathcal{M}}$ on the system $B^n$. To be a valid
POVM, the set $\{\Lambda_m\}$ of $|\mathcal{M}|$ operators should all be positive semidefinite and sum to
the identity: $\Lambda_m \ge 0$, $\sum_m \Lambda_m = I$.

In the context of our coding strategy, the decoding measurement aims to distinguish
the $|\mathcal{M}|$ possible states of the form (3.7). The advantage of the quantum coding
paradigm is that it allows for joint measurements on all the outputs of the channel,
which is more powerful than measuring the systems individually.
We define the average probability of error for the end-to-end protocol as

$$
\bar{p}_e \equiv \frac{1}{|\mathcal{M}|} \sum_m \mathrm{Tr}\left\{ \left(I - \Lambda_{x^n(m)}\right) \rho^{B^n}_{x^n(m)} \right\}, \tag{3.8}
$$

where the operator $\left(I - \Lambda_{x^n(m)}\right)$ corresponds to the complement of the correct decoding
outcome.

Definition 3.2. An $(n, R, \epsilon)$ classical-quantum coding protocol consists of a message
set $\mathcal{M}$, where $|\mathcal{M}| = 2^{nR}$, an encoding map $\mathcal{E} : \mathcal{M} \to \mathcal{X}^n$ described by a codebook
$\{x^n(m)\}_{m \in \mathcal{M}}$, and a decoding measurement (POVM) $\{\Lambda_{x^n(m)}\}_{m \in \mathcal{M}}$ such that the
average probability of error is bounded from above as $\bar{p}_e \le \epsilon$.

Theorem 3.2 (HSW Theorem [Hol98, SW97]). The classical communication capacity
of a classical-quantum channel $(\mathcal{X}, \mathcal{N}^{X\to B}(x) \equiv \rho_x^B, \mathcal{H}^B)$ is given by:

$$
C(\mathcal{N}) = \max_{p_X} I(X;B)_\theta, \tag{3.9}
$$

where the optimization is taken over all possible input distributions $p_X$, and where
entropic quantities are calculated with respect to the following state:

$$
\theta^{XB} \equiv \sum_x p_X(x)\, |x\rangle\langle x|^X \otimes \rho_x^B. \tag{3.10}
$$



The classical-quantum state $\theta^{XB}$ is the state with respect to which we will calculate
mutual information quantities. We call this state the code state and it extends
the classical joint probability distribution induced by a channel, when the input
distribution $p_X$ is used to construct the codebook: $p_X(x)\, p_{Y|X}(y|x)$. In the case of the
classical-quantum channel, the outputs are quantum systems. Information quantities
taken with respect to classical-quantum states are called "Holevo" quantities in honour
of Alexander Holevo who was first to recognize the importance of this expression
by proving that it is an upper bound to the accessible information of an ensemble
[Hol73, Hol79]. Holevo quantities are expressed as a difference of two entropic terms:

$$
I(X;B)_\theta \equiv H(B)_\theta - H(B|X)_\theta \equiv H\!\left(\sum_x p_X(x)\rho_x^B\right) - \sum_x p_X(x) H(\rho_x^B). \tag{3.11}
$$

Holevo quantities are in some sense partially classical, since the entropies are with
respect to quantum systems, but the conditioning is classical.
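The Holevo quantity (3.11) is straightforward to evaluate for a small ensemble. The sketch below assumes base-2 logarithms; the ensemble of two equiprobable non-orthogonal pure qubit states is our own example and the function names are illustrative.

```python
import numpy as np

def entropy(rho, tol=1e-12):
    """von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > tol]
    return float(-(evals * np.log2(evals)).sum())

def holevo_information(p_x, states):
    """I(X;B) = H(sum_x p(x) rho_x) - sum_x p(x) H(rho_x), as in (3.11)."""
    average = sum(p * rho for p, rho in zip(p_x, states))
    return entropy(average) - sum(p * entropy(rho) for p, rho in zip(p_x, states))

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
nonorthogonal = [np.outer(ket0, ket0), np.outer(plus, plus)]
orthogonal = [np.outer(ket0, ket0), np.outer(ket1, ket1)]
print(holevo_information([0.5, 0.5], nonorthogonal),
      holevo_information([0.5, 0.5], orthogonal))
```

For pure signal states the second term of (3.11) vanishes, so the Holevo quantity reduces to the entropy of the average state. Orthogonal states give a full bit, while the overlapping pair gives strictly less.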
Quantum decoding

When devising coding strategies for classical-quantum channels, the main obstacle
to overcome is the construction of a decoding POVM that correctly identifies the
messages. Using the properties of quantum typical subspaces we can construct a set of
positive operators $\{P_m\}_{m \in \mathcal{M}}$ which, analogously to the classical conditionally typical
indicator functions, are good at detecting ($\mathrm{Tr}[P_m\, \rho_m] \ge 1 - \epsilon$) and distinguishing
($\mathrm{Tr}[P_m\, \rho_{m' \neq m}] \le \epsilon$) the output states produced by each message. We can construct a
valid POVM by normalizing these operators:

$$
\Lambda_m \equiv \left(\sum_k P_k\right)^{-\frac{1}{2}} P_m \left(\sum_k P_k\right)^{-\frac{1}{2}}, \tag{3.12}
$$

so that we will have $\sum_m \Lambda_m = I$. This is known as the square root measurement or
the pretty good measurement [Hol98, SW97].
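The normalization (3.12) can be sketched numerically as follows. This example assumes NumPy and takes the inverse square root on the support of $\sum_k P_k$, so that the resulting operators sum to the projector onto that support, which is the identity whenever the sum has full rank. The two-state example and the function names are ours.

```python
import numpy as np

def inverse_sqrt(a, tol=1e-12):
    """Inverse square root of a positive semidefinite matrix on its support."""
    evals, evecs = np.linalg.eigh(a)
    inv = np.zeros_like(evals)
    keep = evals > tol
    inv[keep] = 1.0 / np.sqrt(evals[keep])
    return (evecs * inv) @ evecs.conj().T     # scales each eigenvector column

def square_root_measurement(operators):
    """Lambda_m = (sum_k P_k)^{-1/2} P_m (sum_k P_k)^{-1/2}, as in (3.12)."""
    s = inverse_sqrt(sum(operators))
    return [s @ p @ s for p in operators]

ket0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
detectors = [np.outer(ket0, ket0), np.outer(plus, plus)]   # P_0 and P_1
povm = square_root_measurement(detectors)

# Probability that the detector for message 0 clicks when state 0 is sent.
print(np.trace(povm[0] @ detectors[0]).real)
```

In the HSW decoder the operators $P_m$ act on $n$ channel outputs and are far too large to build explicitly, so this sketch is meant only to show the structure of the normalization on a toy example.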




The achievability proof of Theorem 3.2 is based on the properties of typical subspaces
and the square root measurement. We construct a set of unnormalized positive
operators

$$
P_m \equiv \Pi_{\bar{\rho}}\; \Pi_{x^n(m)}\; \Pi_{\bar{\rho}}, \tag{3.13}
$$

where $\Pi_{x^n(m)} \equiv \Pi^{B^n}_{\rho_{x^n(m)},\delta}$ is the conditionally typical projector that corresponds to the
input sequence $x^n(m)$ and $\Pi_{\bar{\rho}} \equiv \Pi^n_{\bar{\rho}^{\otimes n},\delta}$ is the output-typical projector for the average
output state $\bar{\rho} \equiv \sum_x p_X(x) \rho_x^B$. The operator "sandwich" in equation (3.13) corresponds
directly to the decoding criteria used in the classical coding theorem. We require the
state to be in the output-typical subspace and inside the conditionally typical subspace
for the correct codeword $x^n(m)$. The decoding POVM is then constructed as in (3.12).

By using the properties of the typical projectors, we can show that the probability
of error of this coding scheme vanishes provided $R < I(X;B) - \delta$. An effort has been
made to present the proofs of the classical and quantum coding theorems in a similar
fashion in order to highlight similarities in the reasoning.
3.3 Proof of HSW Theorem

In this section we give the details of the POVM construction and the error analysis for
the decoder used by the receiver in the HSW Theorem.

Recall the classical-quantum state (3.10), with respect to which our code is constructed:

$$
\theta^{XB} \equiv \sum_x p_X(x)\, |x\rangle\langle x|^X \otimes \rho_x^B. \tag{3.14}
$$

For each input sequence $x^n$, there is a corresponding $\delta$-conditionally typical projector:
$\Pi_{x^n} \equiv \Pi^{B^n}_{\rho_{x^n},\delta}$.

Define also the average output state $\bar{\rho} \equiv \sum_x p_X(x) \rho_x$ and the corresponding
average-output-typical projector $\Pi_{\bar{\rho}} \equiv \Pi^n_{\bar{\rho}^{\otimes n},\delta}$.

The Receiver constructs a decoding POVM $\{\Lambda_m\}_{m \in \mathcal{M}}$ by starting from the projector
sandwich:

$$
P_m \equiv \Pi_{\bar{\rho}}\; \Pi_{x^n(m)}\; \Pi_{\bar{\rho}}, \tag{3.15}
$$

and normalizing the operators:

$$
\Lambda_m \equiv \left(\sum_k P_k\right)^{-\frac{1}{2}} P_m \left(\sum_k P_k\right)^{-\frac{1}{2}}. \tag{3.16}
$$
The error analysis of a square root measurement is greatly simplified by using the
Hayashi-Nagaoka operator inequality.

Lemma 3.1 (Hayashi-Nagaoka [HN03]). If $S$ and $T$ are operators such that $0 \le T$
and $0 \le S \le I$, then

$$
I - (S + T)^{-\frac{1}{2}}\, S\, (S + T)^{-\frac{1}{2}} \le 2(I - S) + 4T. \tag{3.17}
$$

If we let $S = P_m$ and $T = \sum_{m' \neq m} P_{m'}$ in the above inequality we obtain

$$
I - \Lambda_m \le 2(I - P_m) + 4 \sum_{m' \neq m} P_{m'}, \tag{3.18}
$$

which corresponds to the decomposition of the error outcome $(I - \Lambda_m)$ into two
contributions:

I. The probability that the correct detector does not "click": $(I - P_m)$. This
corresponds to the error events (E0) and (E1) in the classical coding theorem.

II. The probability that a wrong detector "clicks": $\sum_{m' \neq m} P_{m'}$. This corresponds to
the error event (E2) in the classical case.
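The operator inequality (3.17) can be tested on random instances by checking that the difference of the two sides is positive semidefinite. The sketch below assumes NumPy and randomly sampled operators of our own choosing; the helper names are illustrative.

```python
import numpy as np

def random_psd(d, rng):
    """Random positive semidefinite d x d matrix."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T

def inverse_sqrt(a, tol=1e-12):
    """Inverse square root of a positive semidefinite matrix on its support."""
    evals, evecs = np.linalg.eigh(a)
    inv = np.zeros_like(evals)
    keep = evals > tol
    inv[keep] = 1.0 / np.sqrt(evals[keep])
    return (evecs * inv) @ evecs.conj().T

def hayashi_nagaoka_gap(s, t):
    """Smallest eigenvalue of (right side - left side) of (3.17)."""
    eye = np.eye(s.shape[0])
    r = inverse_sqrt(s + t)
    lhs = eye - r @ s @ r
    rhs = 2 * (eye - s) + 4 * t
    return float(np.linalg.eigvalsh(rhs - lhs).min())

rng = np.random.default_rng(0)
s = random_psd(4, rng)
s = s / np.linalg.eigvalsh(s).max()       # enforce 0 <= S <= I
t = random_psd(4, rng)                    # any T >= 0
print(hayashi_nagaoka_gap(s, t))
```

A nonnegative smallest eigenvalue (up to rounding error) means the inequality holds for that instance. In the proof, $S$ plays the role of the correct detector $P_m$ and $T$ collects all the wrong detectors.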

We will show that the average probability of error 

Pe = JYTi ^ "^^{{^ ~ ) Px"{m) } 5 

will be small provided the rate R < I{X; B) - 5 = H{B) - H{B\X) - 6. The bound 
follows from the following properties of typical projectors: 

Hpp®" Up < 2-"[^(^)~'^lnp, (3.20) 

and reasoning analogous to that used in the classical coding theorem. Note that by 
the symmetry of both the codebook construction and the decoder we can study the 
error analysis for a fixed message m. 

Consider the probability of error when the message m is sent, and let us apply the 



Hayashi-Nagaoka operator inequality (Lemma 3.1) to split the error into two terms: 



Pe = Tr[(/-A^")pfr(^)] 

< 2 Tr [(/ - ) pf„VJ + 4 5^ Tr [Pj Pf-V)] • (3-21) 



(I) ^ ^ 

(II) 



We bound the expectation of the average probability of error by bounding the 
individual terms. 

We now state two useful results, which we need to bound the first error term. 



First, recall the inequality from Lemma 2.1 which states that:

$$\mathrm{Tr}[\Lambda\rho] \;\le\; \mathrm{Tr}[\Lambda\sigma] + \|\rho - \sigma\|_1, \tag{3.22}$$

holds for all operators such that $0 \le \rho, \sigma, \Lambda \le I$.

The second ingredient is the gentle measurement lemma.

Lemma 3.2 (Gentle operator lemma for ensembles [Win99]). Let $\{p(x), \rho_x\}$ be an
ensemble and let $\bar\rho \equiv \sum_x p(x)\,\rho_x$. If an operator $\Lambda$, where $0 \le \Lambda \le I$, has high overlap
with the average state, $\mathrm{Tr}[\Lambda\bar\rho] \ge 1 - \epsilon$, then the subnormalized state $\sqrt{\Lambda}\rho_x\sqrt{\Lambda}$ is close
in trace distance to the original state $\rho_x$ on average: $\mathbb{E}_X\big\{\big\|\sqrt{\Lambda}\rho_X\sqrt{\Lambda} - \rho_X\big\|_1\big\} \le 2\sqrt{\epsilon}$.



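We bound the expectation over the code randomness for the first term in (3.21)
as follows:

$$\begin{aligned}
\mathbb{E}_{X^n}\{(\mathrm{I})\} &= \mathbb{E}_{X^n}\mathrm{Tr}\big[(I - P_m)\,\rho_{X^n(m)}\big] \\
&= 1 - \mathbb{E}_{X^n}\mathrm{Tr}\big[\Pi^n_{X^n(m)}\;\Pi^n_{\bar\rho}\,\rho_{X^n(m)}\,\Pi^n_{\bar\rho}\big] \\
&\overset{\text{①}}{\le} 1 - \mathbb{E}_{X^n}\mathrm{Tr}\big[\Pi^n_{X^n(m)}\,\rho_{X^n(m)}\big] + \mathbb{E}_{X^n}\big\|\Pi^n_{\bar\rho}\,\rho_{X^n(m)}\,\Pi^n_{\bar\rho} - \rho_{X^n(m)}\big\|_1 \\
&\overset{\text{②}}{\le} 1 - \mathbb{E}_{X^n}\mathrm{Tr}\big[\Pi^n_{X^n(m)}\,\rho_{X^n(m)}\big] + 2\sqrt{\epsilon} \\
&\overset{\text{③}}{\le} 1 - (1 - \epsilon) + 2\sqrt{\epsilon} \\
&= \epsilon + 2\sqrt{\epsilon}.
\end{aligned}$$

The inequality ① follows from equation (3.22). The inequality ② follows from Lemma 3.2
and the property of the average output state $\mathrm{Tr}[\Pi^n_{\bar\rho}\,\bar\rho^{\otimes n}] \ge 1 - \epsilon$. The inequality ③
follows from: $\mathbb{E}_{X^n}\mathrm{Tr}\big[\Pi^n_{X^n(m)}\,\rho_{X^n(m)}\big] \ge 1 - \epsilon$.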

The crucial Holevo information-dependent bound on the expectation of the second
term in (3.21) can be obtained by using the quantum packing lemma. The quantum
packing lemma (Lemma B.1), given in Appendix B.2, provides a bound on the amount
of overlap between the conditionally typical subspaces for the codewords in our code
construction and is analogous to the classical packing lemma (Lemma A.1), which
we used to prove the classical channel coding theorem. Note that Lemma B.1 is less
general than the quantum packing lemmas which appear in [HDW08] and [Wil11].

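The overall probability of error is thus bounded as

$$\mathbb{E}_{X^n}\{\bar p_e\} \;\le\; 2\big(\epsilon + 2\sqrt{\epsilon}\big) + 4\cdot 2^{-n[I(X;B) - 2\delta - R]}, \tag{3.23}$$

and if we choose $R < I(X;B) - 3\delta$, the second term vanishes in the limit $n \to \infty$, so the
probability of error is bounded from above by $2(\epsilon + 2\sqrt{\epsilon})$, which can be made arbitrarily small.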



Example 3.1 (Point-to-point channel). Consider the classical-quantum channel $\mathcal{N} \equiv
(\{0,1\}, \rho_x, \mathbb{C}^2)$, which takes a classical bit as input and outputs a qubit (a two-dimensional
quantum system). Suppose the channel map is the following:

$$0 \to \rho_0 \equiv |0\rangle\langle 0| = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad 1 \to \rho_1 \equiv |+\rangle\langle +| = \begin{bmatrix} \tfrac12 & \tfrac12 \\ \tfrac12 & \tfrac12 \end{bmatrix}. \tag{3.24}$$

We calculate the channel capacity for three different measurement strategies: two
classical strategies where the channel outputs are measured independently, and a
quantum strategy that uses collective measurements on blocks of $n$ channel outputs. Because
the input is binary, it is possible to plot the achievable rates for all input distributions
$p_X$. See Figure 3.5 for a plot of the achievable rates for these three strategies.



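a) Basic classical decoding: A classical strategy for this channel corresponds to
the channel outputs being individually measured in the computational basis:

$$\Lambda_0 \equiv |0\rangle\langle 0|, \quad \Lambda_1 \equiv |1\rangle\langle 1|, \qquad \Lambda_{y^n} \equiv \Lambda_{y_1}\otimes\Lambda_{y_2}\otimes\cdots\otimes\Lambda_{y_n}. \tag{3.25}$$

Such a communication model for the channel is classical since we have $\mathrm{Tr}[\Lambda_{y^n}\,\rho_{x^n}] \equiv
p_{Y^n|X^n}(y^n|x^n)$. More specifically, $p_{Y^n|X^n}(y^n|x^n) = \prod_{i=1}^n p_{Y|X}(y_i|x_i)$, where $p_{Y|X}(y|x)$ is a
classical Z-channel with transition probability $p_z \equiv p_{Y|X}(0|1) = \mathrm{Tr}[\Lambda_0\,|+\rangle\langle +|] = 0.5$.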



The capacity of the classical Z-channel is given by:

$$C^{(a)}(\mathcal{N}) = \max_{0\le p_0\le 1}\; H_2\big((1-p_0)(1-p_z)\big) - (1-p_0)\,H_2(p_z), \tag{3.26}$$

where we parametrize in terms of $p_0 \equiv p_X(0)$. For this model, the capacity-achieving
input distribution has $p_0 = 0.6$ and the capacity is $C^{(a)} = H_2(0.2) - 0.4 \approx 0.3219$.
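The maximization in (3.26) can be checked numerically. The sketch below evaluates the Z-channel mutual information on a grid of input distributions and picks the best one; the grid search and the function names `h2` and `z_channel_rate` are our own illustrative choices, not part of the original derivation.

```python
import math

def h2(p):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def z_channel_rate(p0, pz=0.5):
    """Mutual information I(X;Y) of the Z-channel for the input distribution (p0, 1 - p0)."""
    return h2((1 - p0) * (1 - pz)) - (1 - p0) * h2(pz)

# Grid search over p0 in [0, 1], mirroring the maximization in (3.26).
grid = [k / 10000 for k in range(10001)]
best_p0 = max(grid, key=z_channel_rate)
capacity_a = z_channel_rate(best_p0)
print(round(best_p0, 3), round(capacity_a, 4))
```

The search recovers the values quoted in the text, $p_0 = 0.6$ and $C^{(a)} \approx 0.3219$.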

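b) Aligned classical decoding: A better classical model is to use a "rotated" quantum
measurement such that the measurement operators are symmetrically aligned with
the channel outputs. The measurement directions $-\pi/8$ and $\pi/4 + \pi/8$ are symmetric
around the output states $|0\rangle$ and $|+\rangle$. Define the notation $c_{\pi/8} \equiv \cos(\pi/8)$ and
$s_{\pi/8} \equiv \sin(\pi/8)$. The measurement along the $-\pi/8$ and $\pi/4 + \pi/8$ directions
corresponds to the following POVM operators:

$$\Lambda_0 \equiv \big(c_{\pi/8}|0\rangle - s_{\pi/8}|1\rangle\big)\big(c_{\pi/8}\langle 0| - s_{\pi/8}\langle 1|\big) = \begin{bmatrix} c^2_{\pi/8} & -c_{\pi/8}s_{\pi/8} \\ -c_{\pi/8}s_{\pi/8} & s^2_{\pi/8} \end{bmatrix}_{\{|0\rangle,|1\rangle\}},$$

$$\Lambda_1 \equiv \big(c_{\pi/8}|+\rangle - s_{\pi/8}|-\rangle\big)\big(c_{\pi/8}\langle +| - s_{\pi/8}\langle -|\big) = \begin{bmatrix} c^2_{\pi/8} & -c_{\pi/8}s_{\pi/8} \\ -c_{\pi/8}s_{\pi/8} & s^2_{\pi/8} \end{bmatrix}_{\{|+\rangle,|-\rangle\}},$$

where the matrix representations are expressed in the basis indicated in subscript.

Using this measurement on channel outputs induces a classical channel $p_{Y^{(b)}|X}$
with transition probabilities

$$p_{Y^{(b)}|X}(0|0) = p_{Y^{(b)}|X}(1|1) = c^2_{\pi/8}, \qquad p_{Y^{(b)}|X}(1|0) = p_{Y^{(b)}|X}(0|1) = s^2_{\pi/8}, \tag{3.27}$$

which corresponds to a binary symmetric channel (BSC) with crossover probability
$p_e = s^2_{\pi/8} = \sin^2(\pi/8)$ and success probability $p_s = c^2_{\pi/8}$. The capacity of this BSC is
given by:

$$C^{(b)}(\mathcal{N}) = 1 - H_2(p_s) = 1 - H_2\big(\cos^2(\pi/8)\big) \approx 0.3991. \tag{3.28}$$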
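The induced binary symmetric channel can be recomputed directly from the output states and the two measurement directions. In the sketch below, states and measurement vectors are written as real amplitude pairs in the $\{|0\rangle, |1\rangle\}$ basis; this representation and the names `states`, `detectors` and `transition` are assumptions made for the illustration.

```python
import math

def h2(p):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

c, s = math.cos(math.pi / 8), math.sin(math.pi / 8)
r = 1 / math.sqrt(2)

# Real amplitudes in the {|0>, |1>} basis.
states = {0: (1.0, 0.0), 1: (r, r)}                        # |0> and |+>
detectors = {0: (c, -s),                                   # direction -pi/8
             1: (r * (c - s), r * (c + s))}                # c|+> - s|->, direction 3pi/8

def transition(y, x):
    """p(y|x) = Tr[Lambda_y rho_x] = <v_y|psi_x>^2 for rank-one projectors and pure states."""
    v, psi = detectors[y], states[x]
    return (v[0] * psi[0] + v[1] * psi[1]) ** 2

p_cross = transition(1, 0)            # crossover probability of the induced BSC
capacity_b = 1 - h2(p_cross)
print(round(p_cross, 4), round(transition(0, 1), 4), round(capacity_b, 4))
```

Both crossover probabilities equal $\sin^2(\pi/8)$, which confirms that the induced channel is symmetric, and the resulting capacity agrees with (3.28).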



c) Holevo limit: The HSW Theorem tells us the ultimate capacity of this channel
is given by

$$C^{(c)}(\mathcal{N}) = \max_{p_X}\Big[ H\Big(\sum_x p_X(x)\,\rho_x\Big) - \sum_x p_X(x)\,H(\rho_x) \Big]. \tag{3.29}$$

In our case, the capacity is achieved using the uniform input distribution. The capacity
for this channel using a quantum measurement is therefore:

$$C^{(c)}(\mathcal{N}) = H_2\big(\cos^2(\pi/8)\big) \approx 0.6009. \tag{3.30}$$
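The Holevo quantity for this qubit ensemble has a closed form, which makes the value in (3.30) easy to reproduce. The sketch below is an illustration under our own naming (`holevo_quantity`); it relies on the fact that both channel outputs are pure, so only the entropy of the average state contributes.

```python
import math

def h2(p):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def holevo_quantity(p0):
    """Holevo quantity of the ensemble {p0: |0><0|, 1 - p0: |+><+|}.

    The average state is a 2x2 matrix with unit trace and determinant
    p0 (1 - p0) / 2, so its eigenvalues are (1 +- sqrt(1 - 2 p0 (1 - p0))) / 2.
    The output states are pure, hence the conditional entropy term is zero.
    """
    lam = (1 + math.sqrt(1 - 2 * p0 * (1 - p0))) / 2
    return h2(lam)

grid = [k / 1000 for k in range(1001)]
best_p0 = max(grid, key=holevo_quantity)
capacity_c = holevo_quantity(best_p0)
print(best_p0, round(capacity_c, 4))
```

The maximum is attained at the uniform input distribution, where the larger eigenvalue of the average state equals $\cos^2(\pi/8)$.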



In general, a collective measurement on blocks of $n$ outputs of the channel is
required to achieve the capacity. This means that the POVM operators $\{\Lambda_m\}$ cannot
be written as a tensor product of measurement operators on the individual output
systems. The channel capacity can be achieved using the random coding approach and
the square root measurement based on conditionally typical projectors as shown in the
proof of Theorem 3.2.

Figure 3.5: Plot of the achievable rates for the point-to-point channel given by the
map $0 \to |0\rangle\langle 0|^B$, $1 \to |+\rangle\langle +|^B$ under three models. The horizontal axis corresponds to the
parameter $p_0 = p_X(0)$ of the input distribution. The first model treats each output of the
channel as a classical bit $Y^{(a)} \in \{0,1\}$ corresponding to the output of a measurement in
the computational basis: $\{\Lambda^{(a)}_y\}_{y\in\{0,1\}} = \{|0\rangle\langle 0|, |1\rangle\langle 1|\}$. The mutual information $I(X;Y^{(a)})$
for all input distributions $p_X$ is plotted as a dashed line. Under this model, the channel $\mathcal{N}$
corresponds to a classical Z-channel. A better approach is to use a symmetric measurement
with output denoted as $Y^{(b)}$, which corresponds to a classical binary symmetric channel. The
mutual information $I(X;Y^{(b)})$ is plotted as a dot-dashed line. The best coding strategy is
to use block measurements. The Holevo quantity $H\big(\sum_x p_X(x)\rho_x\big) - \sum_x p_X(x)H(\rho_x)$ for all
input distributions is plotted as a solid line. The capacity of the channel under each model
is given by the maximum of each function curve: $C^{(a)}(\mathcal{N}) \approx 0.3219$, $C^{(b)}(\mathcal{N}) \approx 0.3991$,
and $C^{(c)}(\mathcal{N}) = H_2(\cos^2(\pi/8)) \approx 0.6009$. For this particular channel the quantum decoding
strategy leads to a 50% improvement in the achievable communication rates relative to the
best classical strategy.
Legend: a) Computational basis classical measurement; b) Symmetric classical measurement; c) Holevo capacity.

3.4 Discussion



This chapter introduced the key concepts of the classical and quantum channel coding
paradigms. The situation considered in Example 3.1 serves as an illustration of the
potential benefits that exist for modelling communication channels using quantum
mechanics.

The key take-away from this chapter is that collective measurements on blocks of
channel outputs are necessary in order to achieve the ultimate capacity of
classical-quantum communication channels, and that classical strategies which measure the
channel outputs individually are suboptimal. The increased capacity is perhaps the
most notable difference that exists between the classical and classical-quantum paradigms
for communication [Gam].

In the remainder of this thesis, we will study multiuser classical-quantum
communication models and see various coding strategies, measurement constructions and
error analysis techniques which are necessary in order to prove coding theorems.

The multiple access channel is a communication model for situations in which multiple
senders are trying to transmit information to a single receiver. To fully solve the
multiple access channel problem is to characterize all possible transmission rates for
the senders which are decodable by the receiver. We will see that there is a natural
tradeoff between the rates of the senders; the louder that one of the senders "speaks,"
the more difficult it will be for the receiver to "hear" the other senders.



4.1 Introduction 



The classical multiple access channel $\mathcal{N}^{X_1X_2\to Y}$ is a triple
$(\mathcal{X}_1\times\mathcal{X}_2,\ \mathcal{N}(x_1,x_2) \equiv p_{Y|X_1X_2}(y|x_1,x_2),\ \mathcal{Y})$, where $\mathcal{X}_1$ and
$\mathcal{X}_2$ are the input alphabets for the two senders, $\mathcal{Y}$ is the output
alphabet and $p_{Y|X_1X_2}(y|x_1,x_2)$ is a conditional probability
distribution which describes the channel behaviour.

Our task is to characterize the communication rates $(R_1,R_2)$ that are achievable
from Sender 1 to the receiver and from Sender 2 to the receiver.

Figure 4.1: A classical multiple access channel.

Example 4.1. Consider a situation in which two senders use laser light pulses to
communicate to a distant receiver equipped with an optical instrument and a
photodetector. In each time instant, Sender 1 can choose to send either a weak pulse
of light or a strong pulse, so the input alphabet $\mathcal{X}_1$ has two symbols. Sender 2 similarly
has two possible inputs in $\mathcal{X}_2$. The receiver measures the light intensity coming into the
telescope, and we model his reading as an output space $\mathcal{Y}$ with three symbols. The output
signal is the sum of the incoming signals: $Y = X_1 + X_2$. The channel is deterministic:
for each of the four input pairs, $p_{Y|X_1X_2}(y|x_1,x_2) = 1$ when $y$ is the sum of $x_1$ and $x_2$.

Figure 4.2: A real-world multiple access channel $\mathcal{N}_1$.

The rate pair $(R_1,R_2) = (1,0)$ is achievable if we force Sender 2 to always send a
constant input. The resulting channel between Sender 1 and the receiver is a noiseless
binary channel. The rate $(0,1)$ is similarly achievable if we fix Sender 1's input. A
natural question is to ask what other rates are achievable for this communication
channel.

Note that the model used to describe the above communication scenario is very
crude and serves only as a first approximation, which we use to illustrate the basic
ideas of multiple access communication. In Section 4.1.2 we will consider more general
models for multiple access channels, which allow the channel outputs to be quantum
systems. In Chapter 8 we will refine the model further by taking into account certain
aspects of quantum optics.



4.1.1 Review of classical results 

The multiple access channel is one of the first multiuser communications problems
ever considered [Sha61]. It is also one of the rare problems in network information
theory where a full capacity result is known, i.e., the best known achievable rate region
matches a proven outer bound. The multiple access channel plays an important role
as a building block for other network communication scenarios.

The capacity region of the classical discrete memoryless multiple access channel
(DM-MAC) was established by Ahlswede [Ahl71, Ahl74a] and Liao [Lia72]. Consider
the classical multiple access channel with two senders described by $\mathcal{N} = (\mathcal{X}_1 \times
\mathcal{X}_2, p_{Y|X_1X_2}, \mathcal{Y})$. The capacity region for this channel is given by

$$C_{\mathrm{MAC}}(\mathcal{N}) \equiv \bigcup_{p_{X_1},\,p_{X_2}} \left\{ (R_1,R_2)\in\mathbb{R}^2_+ \;\middle|\; \begin{array}{rcl} R_1 &\le& I(X_1;Y|X_2) \\ R_2 &\le& I(X_2;Y|X_1) \\ R_1 + R_2 &\le& I(X_1X_2;Y) \end{array} \right\},$$

where $p_{X_1} \in \mathcal{P}(\mathcal{X}_1)$, $p_{X_2} \in \mathcal{P}(\mathcal{X}_2)$ and the mutual information quantities are taken
with respect to the joint input-output distribution

$$p_{X_1X_2Y}(x_1,x_2,y) \equiv p_{X_1}(x_1)\,p_{X_2}(x_2)\,p_{Y|X_1X_2}(y|x_1,x_2). \tag{4.1}$$

Note that the input distribution is chosen to be a product distribution $p_{X_1}p_{X_2}$, which
reflects the assumption that the two senders are spatially separated and act
independently. We can calculate the exact capacity region of any multiple access channel by
evaluating the mutual information expressions for all possible input distributions and
taking the union.



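Example 4.1 (continued). The capacity region for the multiple access channel $\mathcal{N}_1$
described in Example 4.1 is given by:

$$C_{\mathrm{MAC}}(\mathcal{N}_1) = \big\{ (R_1,R_2)\in\mathbb{R}^2_+ \;\big|\; R_1 \le 1,\ R_2 \le 1,\ R_1 + R_2 \le 1.5 \big\}. \tag{4.2}$$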



To see how the rate pair (1, 0.5) can be achieved, consider an encoding strategy where
each sender generates codebooks according to the uniform probability distribution and
the receiver decodes the messages from Sender 2 first, followed by the messages from
Sender 1. The effective channel from Sender 2 to the receiver when the input of
Sender 1 is unknown corresponds to a symmetric binary erasure channel with erasure
probability 1/2. This is because when the receiver observes the lowest or the highest of
the three output values there is no ambiguity about what was sent. The intermediate
output value could arise in two different ways, so we treat it as an erasure. The capacity
of this channel is 0.5 bits per channel use
[CT91, Example 14.3.3]. Assuming the receiver correctly decodes the codewords from
Sender 2, the resulting channel from Sender 1 to the receiver is a binary noiseless
channel which has capacity one. To achieve the rate pair (0.5, 1) we must generate
codebooks at the appropriate rates and use the opposite decoding order. The capacity
region is illustrated in the following figure.

Figure 4.3: The capacity region of the adder channel.
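The mutual information quantities that bound the capacity region of the adder channel can be evaluated by direct enumeration. The sketch below assumes uniform inputs and encodes the weak pulse as 0 and the strong pulse as 1, so that the output is the integer sum; this numerical encoding and the helper names are our own choices for the illustration.

```python
import math
from itertools import product

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginal of a joint distribution over tuples, keeping the listed coordinates."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

# Joint distribution of (x1, x2, y) for independent uniform inputs and y = x1 + x2.
joint = {(x1, x2, x1 + x2): 0.25 for x1, x2 in product((0, 1), repeat=2)}

def H(*keep):
    """Joint entropy of the selected coordinates of (X1, X2, Y)."""
    return entropy(marginal(joint, keep))

X1, X2, Y = 0, 1, 2
i_x1_y_given_x2 = H(X1, X2) + H(X2, Y) - H(X2) - H(X1, X2, Y)   # I(X1;Y|X2)
i_x2_y_given_x1 = H(X1, X2) + H(X1, Y) - H(X1) - H(X1, X2, Y)   # I(X2;Y|X1)
i_x1x2_y = H(X1, X2) + H(Y) - H(X1, X2, Y)                      # I(X1X2;Y)
i_x2_y = H(X2) + H(Y) - H(X2, Y)                                # I(X2;Y)
print(i_x1_y_given_x2, i_x2_y_given_x1, i_x1x2_y, i_x2_y)
```

The value $I(X_2;Y) = 0.5$ is the rate available to Sender 2 when it is decoded first, and $I(X_1;Y|X_2) = 1$ is the rate available to Sender 1 afterwards, which matches the corner point (1, 0.5) discussed above.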

The above example illustrates the key aspect of the multiple access channel
problem: the trade-off between the communication rates of the senders.



4.1.2 Quantum multiple access channels 



The communication model used to evaluate the capacity in Example 4.1 is classical. 
We modelled the detection of light intensity in a classical way and ignored details of 
the quantum measurement process. 

The capacity result of Ahlswede and Liao is therefore a result which depends on 
the classical model which we used. Better communication rates might be possible if 
we choose to model the quantum degrees of freedom in the communication channel. 



In Example 3.1, we saw how the quantum analysis of the detection aspects of the
communication protocol can lead to improved communication rates for point-to-point
channels. In this chapter, we pursue the study of quantum decoding strategies in the
multiple access setting.

A classical-quantum multiple access channel is defined
as the most general map with two classical inputs and one
quantum output:

$$\mathcal{N}^{X_1X_2\to B}(x_1,x_2) \equiv \rho^B_{x_1,x_2}.$$

Figure 4.4: A quantum multiple access channel with two senders. The outputs of the
channel are conditional quantum states $\rho^B_{x_1,x_2}$.

Our intent is to quantify the communication rates that
are possible for classical communication from each of the
two senders to the receiver. The main difference with the
classical case is that the decoding operation we will use is a
quantum measurement (POVM). We have to find the rate
region for pairs $(R_1,R_2)$ such that the following interconversion can be achieved:

$$n\cdot\mathcal{N}^{X_1X_2\to B} \;\xrightarrow{\;(1-\epsilon)\;}\; nR_1\cdot[c^1\to c] \;+\; nR_2\cdot[c^2\to c]. \tag{4.3}$$

The above expression states that $n$ instances of the channel can be used to carry $nR_1$
classical bits from Sender 1 to the receiver (denoted $[c^1 \to c]$) and $nR_2$ bits from
Sender 2 to the receiver (denoted $[c^2 \to c]$). The communication protocol succeeds
with probability $(1-\epsilon)$ for any $\epsilon > 0$ and sufficiently large $n$.



The problem of classical communication over a classical-quantum multiple-access
channel was solved by Winter [Win01]. He provided single-letter formulas for the
capacity region, which can be computed as an optimization over the choice of input
distributions for the senders. We will discuss Winter's result and proof techniques in
Section 4.2.



Note that there exist other quantum multiple access communication scenarios that
can be considered. The bosonic multiple access channel was studied in [Yen05b]. The
transmission of quantum information over a quantum multiple access channel was
considered in [YDH05, Yar05, YHD08]. The quantum multiple access problem has
also been considered in the entanglement-assisted setting [HDW08, XW11]. In this
chapter, as in the rest of the thesis, we restrict our attention to the problem of classical
communication over classical-quantum channels.

4.1.3 Information processing task

To show that a certain rate pair $(R_1,R_2)$ is achievable we must construct an
end-to-end coding scheme that the two senders and the receiver can employ to communicate
with each other. In this section we specify precisely the different steps involved in the
transmission process.

Sender 1 will send a message $m_1$ chosen from the message set $\mathcal{M}_1 \equiv \{1,2,\ldots,|\mathcal{M}_1|\}$
where $|\mathcal{M}_1| = 2^{nR_1}$. Sender 2 similarly chooses a message $m_2$ from a message set
$\mathcal{M}_2 \equiv \{1,2,\ldots,|\mathcal{M}_2|\}$ where $|\mathcal{M}_2| = 2^{nR_2}$. Senders 1 and 2 encode their messages as
codewords $x_1^n(m_1) \in \mathcal{X}_1^n$ and $x_2^n(m_2) \in \mathcal{X}_2^n$, which are then input to the channel.

The output of the channel is an $n$-fold tensor product state of the form:

$$\rho^{B^n}_{x_1^n(m_1),\,x_2^n(m_2)} \equiv \bigotimes_{i=1}^{n} \rho^{B_i}_{x_{1i}(m_1),\,x_{2i}(m_2)}. \tag{4.4}$$

In order to recover the messages $m_1$ and $m_2$, the receiver performs a positive
operator valued measure (POVM) $\{\Lambda_{m_1,m_2}\}_{m_1\in\mathcal{M}_1,\,m_2\in\mathcal{M}_2}$ on the output of the channel $B^n$.
We denote the measurement outputs as $M_1'$ and $M_2'$. An error occurs whenever the
receiver measurement outcomes differ from the messages that were sent. The overall
probability of error for message pair $(m_1,m_2)$ is

$$p_e(m_1,m_2) \equiv \Pr\big\{(M_1',M_2') \neq (m_1,m_2)\big\} = \mathrm{Tr}\big[(I - \Lambda_{m_1,m_2})\,\rho_{x_1^n(m_1),\,x_2^n(m_2)}\big],$$

where the measurement operator $(I - \Lambda_{m_1,m_2})$ represents the complement of the correct
decoding outcome.

Definition 4.1. An $(n, R_1, R_2, \epsilon)$ code for the multiple access channel consists of two
codebooks $\{x_1^n(m_1)\}_{m_1\in\mathcal{M}_1}$ and $\{x_2^n(m_2)\}_{m_2\in\mathcal{M}_2}$, and a decoding POVM $\{\Lambda_{m_1,m_2}\}$, $m_1 \in
\mathcal{M}_1$, $m_2 \in \mathcal{M}_2$, such that the average probability of error is bounded from above
by $\epsilon$:

$$\bar p_e \equiv \frac{1}{|\mathcal{M}_1||\mathcal{M}_2|}\sum_{m_1,m_2} p_e(m_1,m_2) \;\le\; \epsilon. \tag{4.5}$$

A rate pair $(R_1,R_2)$ is achievable if there exists an $(n, R_1 - \delta, R_2 - \delta, \epsilon)$ quantum
multiple access channel code for all $\epsilon,\delta > 0$ and sufficiently large $n$. The capacity region
$C_{\mathrm{MAC}}(\mathcal{N})$ is the closure of the set of all achievable rates.



4.1.4 Chapter overview 



Suppose we have a two-sender classical-quantum multiple access channel and the two 
messages mi and m2 were sent. This chapter studies the different decoding strategies 
that can be used by the receiver in order to decode the messages. 

The technique used by Winter to prove the achievability of the rates in the
capacity region of the quantum multiple access channel is called successive decoding. In
this approach, the receiver can achieve one of the corner points of the rate region by
decoding the messages in the order "$m_1 \to m_2|m_1$". In doing so, the best possible
rate $R_2$ is achieved, because the receiver will have the side information of $m_1$, and by
extension $x_1^n(m_1)$, when decoding the message $m_2$. This approach is also referred to as
successive cancellation for channels with continuous variable inputs and additive white
Gaussian noise (Gaussian channels) where the first decoded signal can be subtracted
from the received signal. The other corner point can be achieved by decoding in the
opposite order "$m_2 \to m_1|m_2$". These codes can be combined with time-sharing and
resource wasting to achieve all other points in the rate region. We will discuss this
strategy in further detail in Section 4.2 below.



Another approach is to use simultaneous decoding which requires no time-sharing. 
We denote the simultaneous decoding of the messages mi and m2 as "(mi,m2)". As 
far as the QMAC problem is concerned the two approaches yield equivalent achievable 
rate regions. However, if the QMAC code is to be used as part of a larger protocol 
(like a code for the interference channel for example) then the simultaneous decoding 
approach is much more powerful. 



The main contribution in this chapter is Theorem 4.2 in Section 4.3, which shows
that simultaneous decoding for the classical-quantum multiple access channel with two
senders is possible. This result and the techniques developed for its proof will form the
key building blocks for the subsequent chapters in this thesis. We will also comment
on the difficulties in extending the simultaneous decoding approach to more than two
senders (Conjecture 4.1). In Section 4.4, we will briefly discuss a third coding strategy
for the QMAC called rate-splitting.

4.2 Successive decoding

Winter found a single-letter formula for the capacity of the classical-quantum multiple
access channel with $M$ senders [Win01]. We state the result here for two senders.

Theorem 4.1 (Theorem 10 in [Win01]). The capacity region for the classical-quantum
multiple access channel $(\mathcal{X}_1\times\mathcal{X}_2, \rho^B_{x_1,x_2}, \mathcal{H}^B)$ is given by

$$C_{\mathrm{MAC}} \equiv \bigcup_{p_{X_1},\,p_{X_2}} \big\{ (R_1,R_2)\in\mathbb{R}^2_+ \;\big|\; \text{Eqns. (4.7)-(4.9)} \big\}, \tag{4.6}$$

$$R_1 \le I(X_1;B|X_2)_\theta, \tag{4.7}$$
$$R_2 \le I(X_2;B|X_1)_\theta, \tag{4.8}$$
$$R_1 + R_2 \le I(X_1X_2;B)_\theta, \tag{4.9}$$

where the information quantities are taken with respect to the classical-quantum state:

$$\theta^{X_1X_2B} \equiv \sum_{x_1,x_2} p_{X_1}(x_1)\,p_{X_2}(x_2)\;|x_1\rangle\langle x_1|^{X_1}\otimes|x_2\rangle\langle x_2|^{X_2}\otimes\rho^B_{x_1,x_2}. \tag{4.10}$$

Figure 4.5: The rates achievable by successive decoding correspond to the dominant vertices
of the rate region $\alpha_p$ and $\beta_p$. Rates in between these points can be achieved by time-sharing
between the strategies for the two corners.

For a given choice of input probability distribution $p \equiv p_{X_1}, p_{X_2}$, the
achievable rate region, $\mathcal{R}(\mathcal{N},p)$, has the form of a pentagon bounded by the three
inequalities in equations (4.7)-(4.9) and two rate positivity conditions. The two
dominant vertices of this rate region have coordinates $\alpha_p \equiv (I(X_1;B)_\theta,\ I(X_2;B|X_1)_\theta)$ and
$\beta_p \equiv (I(X_1;B|X_2)_\theta,\ I(X_2;B)_\theta)$ and correspond to two alternate successive decoding
strategies. The portion of the line $R_1 + R_2 = I(X_1X_2;B)_\theta$ which lies in between the
points $\alpha_p$ and $\beta_p$ will be referred to as the dominant facet.

In order to show achievability of the entire rate region, Winter proved that each
of the corner points of the region is achievable. By the use of time-sharing we can
achieve any point on the dominant facet of the region, and we can use resource wasting
to achieve all the points on the interior of the region. It follows that the entire rate
region is achievable. We show some of the details of Winter's proof below.
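The geometry of the pentagon and the role of time-sharing can be made concrete with a few lines of code. The sketch below computes the two dominant vertices from the three information quantities using the chain rule, and forms a time-shared rate pair as a convex combination. The function names are hypothetical, and the numbers used are illustrative only: they are the classical values of the adder channel of Example 4.1 with uniform inputs, not quantities computed for a quantum channel.

```python
def dominant_vertices(i1_given_2, i2_given_1, i_sum):
    """Corner points alpha_p and beta_p of the pentagon.

    Uses the chain rule I(X1X2;B) = I(X1;B) + I(X2;B|X1) = I(X2;B) + I(X1;B|X2).
    """
    alpha = (i_sum - i2_given_1, i2_given_1)   # decode m1 first, then m2 given m1
    beta = (i1_given_2, i_sum - i1_given_2)    # decode m2 first, then m1 given m2
    return alpha, beta

def time_share(alpha, beta, lam):
    """Rate pair obtained by using the alpha code for a fraction lam of the channel uses."""
    return tuple(lam * a + (1 - lam) * b for a, b in zip(alpha, beta))

def in_region(rates, i1_given_2, i2_given_1, i_sum, tol=1e-12):
    """Check the three rate inequalities and the two positivity conditions."""
    r1, r2 = rates
    return (0 <= r1 <= i1_given_2 + tol and 0 <= r2 <= i2_given_1 + tol
            and r1 + r2 <= i_sum + tol)

# Illustrative numbers: I(X1;Y|X2) = 1, I(X2;Y|X1) = 1, I(X1X2;Y) = 1.5.
alpha, beta = dominant_vertices(1.0, 1.0, 1.5)
midpoint = time_share(alpha, beta, 0.5)
print(alpha, beta, midpoint, in_region(midpoint, 1.0, 1.0, 1.5))
```

Every time-shared point lies on the dominant facet, since the sum rate of a convex combination of $\alpha_p$ and $\beta_p$ is unchanged.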



Proof sketch. We will use a random coding approach for the codebook construction and
point-to-point decoding measurements based on the conditionally typical projectors.

Fix the input distribution $p = p_{X_1}(x_1)\,p_{X_2}(x_2)$ and choose the rates so that they
correspond to the rate point $\alpha_p$:

$$R_1 = I(X_1;B)_\theta - \delta, \qquad R_2 = I(X_2;B|X_1)_\theta - \delta. \tag{4.11}$$

Codebook construction: Randomly and independently generate $2^{nR_1}$ sequences
$x_1^n(m_1)$, $m_1 \in [1:2^{nR_1}]$, according to $\prod_{i=1}^{n} p_{X_1}(x_{1i})$. Similarly generate randomly and
independently the codebook $\{x_2^n(m_2)\}$, $m_2 \in [1:2^{nR_2}]$, according to $\prod_{i=1}^{n} p_{X_2}(x_{2i})$.



Decoding: When the message pair $(m_1,m_2)$ is sent, the output of the channel will be
$\rho_{x_1^n(m_1),\,x_2^n(m_2)}$. Let $\Pi^n_{x_1^n(m_1),\,x_2^n(m_2)}$ be the conditionally typical projector for that state.
In order to define the other typical projectors necessary for the decoding, we define the
following expectations of the output state:

$$\bar\rho_{x_1^n(m_1)} \equiv \sum_{x_2^n} p_{X_2^n}(x_2^n)\,\rho_{x_1^n(m_1),\,x_2^n} = \bigotimes_{i=1}^{n}\Big(\sum_{x_2} p_{X_2}(x_2)\,\rho_{x_{1i}(m_1),\,x_2}\Big) = \mathbb{E}_{X_2^n}\big\{\rho_{x_1^n(m_1),\,X_2^n}\big\},$$

$$\bar\rho^{\otimes n} \equiv \sum_{x_1^n,\,x_2^n} p_{X_1^n}(x_1^n)\,p_{X_2^n}(x_2^n)\,\rho_{x_1^n,\,x_2^n} = \Big(\sum_{x_1,x_2} p_{X_1}(x_1)\,p_{X_2}(x_2)\,\rho_{x_1,x_2}\Big)^{\otimes n} = \mathbb{E}_{X_1^n,X_2^n}\big\{\rho_{X_1^n,\,X_2^n}\big\}.$$

The state $\bar\rho_{x_1^n(m_1)}$ corresponds to the receiver's output if he treats the codewords of
Sender 2 as noise to be averaged over. The state $\bar\rho^{\otimes n}$ corresponds to the average output
state for a random code constructed according to $p_{X_1}p_{X_2}$. Let $\Pi^n_{\bar\rho_{x_1^n(m_1)}}$
be the conditionally typical projector for $\bar\rho_{x_1^n(m_1)}$ and let $\Pi^n_{\bar\rho}$ be the typical
projector for the state $\bar\rho^{\otimes n}$.

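To achieve the rates of $\alpha_p$, the receiver will decode the messages in the order
"$m_1 \to m_2|m_1$" using a successive decoding procedure. The first step is to use a
quantum instrument $\{\Upsilon_{m_1}\}$ which acts as follows on any state $\sigma$ defined on $B^n$:

$$\sigma^{B^n} \;\mapsto\; \sum_{m_1} |m_1\rangle\langle m_1|^{M_1}\otimes\Upsilon_{m_1}(\sigma), \qquad \Upsilon_{m_1}(\sigma) \equiv \sqrt{\Lambda_{m_1}}\;\sigma\;\sqrt{\Lambda_{m_1}}.$$

The POVM operators $\{\Lambda_{m_1}\}$ are constructed using the typical projector sandwich

$$P_{m_1} \equiv \Pi^n_{\bar\rho}\;\Pi^n_{\bar\rho_{x_1^n(m_1)}}\;\Pi^n_{\bar\rho}, \tag{4.13}$$

where $\Pi^n_{\bar\rho_{x_1^n(m_1)}}$ is the conditionally typical projector for the averaged state $\bar\rho_{x_1^n(m_1)}$ and
$\Pi^n_{\bar\rho}$ is the typical projector for the average output state, and normalized using the square
root measurement approach in order to satisfy $\Lambda_{m_1} \ge 0$, $\sum_{m_1}\Lambda_{m_1} = I$. The purpose of the
quantum instrument is to extract the message $m_1$ and store it in the register $M_1$, but also
leave behind a system in $B'^n$ which can be processed further.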

An error analysis similar to that of the HSW theorem shows that the quantum
instrument $\{\Upsilon_{m_1}\}$ will correctly decode the message $m_1$ with high probability. This is
because we chose the rate for the $m_1$ codebook to be $R_1 = I(X_1;B)_\theta - \delta$. Furthermore,
it can be shown using the gentle operator lemma for ensembles (Lemma 3.2), that the
state which remains in the system $B'^n$ is negligibly disturbed in the process.

The receiver will then perform a second measurement to recover the message $m_2$.
The second measurement is a POVM $\{\Lambda_{m_2|m_1}\}$ constructed from the projectors

$$P_{m_2|m_1} \equiv \Pi^n_{\bar\rho_{x_1^n(m_1)}}\;\Pi^n_{x_1^n(m_1),\,x_2^n(m_2)}\;\Pi^n_{\bar\rho_{x_1^n(m_1)}}, \tag{4.14}$$

and appropriately normalized. Note that this measurement is chosen conditionally on
the codeword $x_1^n(m_1)$ that Sender 1 input to the channel. This is because, when the
correct message $m_1$ is decoded in the first step, the receiver can infer the codeword
which Sender 1 input to the channel. Thus, after the first step, the effective channel
from Sender 2 to the receiver is

$$(x_1^n, x_2^n) \;\to\; \big(x_1^n,\ \rho^{B^n}_{x_1^n,\,x_2^n}\big), \tag{4.15}$$

where $X_1^n$ is a random variable distributed according to $\prod_{i=1}^{n} p_{X_1}$. This is a setting
in which the quantum packing lemma can be applied. By substituting $U^n = X_1^n$
and $X^n = X_2^n$ into Lemma B.1, we conclude that if we choose the rate to be $R_2 =
I(X_2;B|X_1)_\theta - \delta$, then the message $m_2$ will be decoded correctly with high probability.

The rate point $\beta_p$ corresponds to the alternate decode ordering where the receiver
decodes the message $m_2$ first and $m_1$ second. All other rate pairs in the region can
be obtained from the corner points $\alpha_p$ and $\beta_p$ by using time-sharing and resource
wasting. □



Note that one of the key ingredients in the proof was the use of Lemma 3.2, which
guarantees that the act of decoding $m_1$ does not disturb the state too much. This step
of our quantum decoding procedure may be counterintuitive at first glance, since
quantum mechanical measurements are usually described as processes in which the
quantum system is disturbed. Any retrieval of data from a quantum system inevitably
disturbs the state of the system, so the second measurement, which the receiver
performs on the system $B'^n$, may fail if the first measurement has disturbed the state too
much. The gentle measurement lemma guarantees that very little disturbance
to the state occurs when there is one measurement outcome that is very likely.
When the state of the receiver is $\rho_{x_1^n(m_1),\,x_2^n(m_2)}$, we can be almost certain that the outcome of
the quantum instrument $\{\Upsilon_{m_1}\}$ is going to be $m_1$. Therefore, this process leaves the
state in $B'^n$ only slightly disturbed.

The proof technique in Theorem 4.1 generalizes to the case of the $M$-sender MAC,
which has $M!$ dominant vertices. Each vertex corresponds to one permutation of the
decode ordering.



4.3 Simultaneous decoding 



Another approach for achieving the capacity of the multiple access channel, which does
not use time-sharing, is simultaneous decoding. In the classical version of this decoding
strategy, the receiver will report $(m_1,m_2)$ if he finds a unique pair of codewords $X_1^n(m_1)$
and $X_2^n(m_2)$ which are jointly typical with the output of the channel $Y^n$:

$$\big(X_1^n(m_1),\ X_2^n(m_2),\ Y^n\big) \in \mathcal{T}^{(n)}_\epsilon(X_1,X_2,Y). \tag{4.16}$$

Assuming the messages $m_1$ and $m_2$ are sent, we categorize the different kinds of wrong
message decode errors that may occur.

    error     M_1'    M_2'
    (E1)      *       m_2
    (E2)      m_1     *
    (E12)     *       *            (4.17)

The * in the above table denotes any message other than the one which was sent. The
analysis of the classical simultaneous decoder uses the properties of the jointly typical
sequences and the randomness in the codebooks. Recall that a multi-variable sequence
is jointly typical if and only if all the sequences in the subsets of the variables are
jointly typical. Thus, the condition $(X_1^n(m_1), X_2^n(m_2), Y^n) \in \mathcal{T}^{(n)}_\epsilon(X_1,X_2,Y)$ implies
that:

$$\big(X_1^n(m_1),\ Y^n\big) \in \mathcal{T}^{(n)}_\epsilon(X_1,Y), \tag{4.18}$$
$$\big(X_2^n(m_2),\ Y^n\big) \in \mathcal{T}^{(n)}_\epsilon(X_2,Y), \tag{4.19}$$
$$Y^n \in \mathcal{T}^{(n)}_\epsilon(Y). \tag{4.20}$$

Starting from these conditions, it is straightforward to bound the probability of the
different decoding error events using the properties of the jointly typical sequences
[EGK10].

In the quantum case, we can similarly identify three different error terms, the
probabilities of which can be bounded by using the properties of the conditionally typical
projectors. If we can construct a quantum measurement operator that "contains" all
the typical projectors so that we can obtain the appropriate averages of the output
state in the error analysis, then we would have a proof that simultaneous decoding is
possible.

If only things were so simple! The construction of a simultaneous decoding POVM
turns out to be a difficult problem. Despite being built out of the same typical
projectors, the operator constructed according to

$$\Lambda_{m_1,m_2} \;\propto\; \Pi^n_{\bar\rho_{x_2^n(m_2)}}\;\Pi^n_{\bar\rho_{x_1^n(m_1)}}\;\Pi^n_{x_1^n(m_1),\,x_2^n(m_2)}\;\Pi^n_{\bar\rho_{x_1^n(m_1)}}\;\Pi^n_{\bar\rho_{x_2^n(m_2)}}, \tag{4.21}$$

is different from the operator

$$\Lambda'_{m_1,m_2} \;\propto\; \Pi^n_{\bar\rho_{x_1^n(m_1)}}\;\Pi^n_{\bar\rho_{x_2^n(m_2)}}\;\Pi^n_{x_1^n(m_1),\,x_2^n(m_2)}\;\Pi^n_{\bar\rho_{x_2^n(m_2)}}\;\Pi^n_{\bar\rho_{x_1^n(m_1)}}, \tag{4.22}$$

because the different typical projectors do not commute in general. In fact, there is
very little we can say about the relationship between the subspaces spanned by the
two averaged typical projectors: $\Pi^n_{\bar\rho_{x_1^n(m_1)}}$ and $\Pi^n_{\bar\rho_{x_2^n(m_2)}}$. This is a problem because, for
one of the error terms in the analysis, we would like to have $\Pi^n_{\bar\rho_{x_2^n(m_2)}}$ on the "outside"
as in (4.21) so that we can use Property 2.46 of typical projectors to obtain a factor
$2^{nH(B|X_2)}$. For another error term, we want $\Pi^n_{\bar\rho_{x_1^n(m_1)}}$ to be on the outside as in (4.22)
in order to be able to do the averaging in the alternate order to obtain a term of the
form $2^{nH(B|X_1)}$. Thus it would seem, and originally it seemed so to my colleagues and
me, that the construction of a simultaneous decoding POVM for which we can bound
the probability of all error events might be a difficult task.
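The effect of non-commutativity on a projector sandwich can already be seen for a single qubit. In the toy sketch below, the projectors onto $|0\rangle$ and $|+\rangle$ stand in for the two averaged typical projectors; this substitution is purely illustrative and is our own assumption, since the actual typical projectors act on the $n$-fold output space.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P = [[1.0, 0.0], [0.0, 0.0]]   # projector onto |0>
Q = [[0.5, 0.5], [0.5, 0.5]]   # projector onto |+>

PQ, QP = matmul(P, Q), matmul(Q, P)
commutator = [[x - y for x, y in zip(row_pq, row_qp)]
              for row_pq, row_qp in zip(PQ, QP)]

PQP = matmul(PQ, P)            # Q sandwiched by P: P is on the "outside"
QPQ = matmul(QP, Q)            # P sandwiched by Q: Q is on the "outside"

print(commutator)              # nonzero, so P and Q do not commute
print(PQP, QPQ)                # the two sandwiches are different operators
```

Here $PQP$ is supported on the range of $P$ while $QPQ$ is supported on the range of $Q$, which is the single-qubit analogue of the ordering problem between (4.21) and (4.22).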



Quantum simultaneous decoding actually is possible, and this is what we will
show in this section for the case of the multiple access channel with two senders. Our
proof techniques do not generalize readily to quantum multiple access channels with
more than two independent senders. At the end of this section we will formulate
Conjecture 4.1 regarding the existence of a simultaneous decoder for three-sender multiple
access channels, which will be required for the proof of Theorem 5.3 in the next chapter.

Figure 4.6: Simultaneous decoding strategy. Simultaneous decoding of the two messages is
more powerful than successive decoding, because it allows us to achieve any rate pair $(R_1,R_2)$
of the capacity region without the need for time-sharing.



Theorem 4.2 (Two-sender quantum simultaneous decoding). Let $(\mathcal{X}_1\times\mathcal{X}_2, \rho^B_{x_1,x_2}, \mathcal{H}^B)$
be a quantum multiple access channel with two senders and a single receiver, and let
$p = p_{X_1}p_{X_2}$ be a choice for the input code distribution. Let $\{X_1^n(m_1)\}_{m_1\in\{1,\ldots,|\mathcal{M}_1|\}}$
and $\{X_2^n(m_2)\}_{m_2\in\{1,\ldots,|\mathcal{M}_2|\}}$ be random codebooks generated according to the
product distributions $p_{X_1^n}$ and $p_{X_2^n}$. There exists a simultaneous decoding POVM
$\{\Lambda_{m_1,m_2}\}_{m_1\in\mathcal{M}_1,\,m_2\in\mathcal{M}_2}$, with expected average probability of error bounded from above
by $\epsilon$ for all $\epsilon,\delta > 0$ and sufficiently large $n$, provided the rates $R_1, R_2$ satisfy the
inequalities

$$R_1 \le I(X_1;B|X_2)_\theta, \tag{4.23}$$
$$R_2 \le I(X_2;B|X_1)_\theta, \tag{4.24}$$
$$R_1 + R_2 \le I(X_1X_2;B)_\theta, \tag{4.25}$$

where the state $\theta^{X_1X_2B}$ is defined in (4.10).



The main difference between the coding strategy employed by Winter in the proof
of Theorem 4.1 and Theorem 4.2 above is that the latter does not require the use of
time-sharing. Using the simultaneous decoding approach we can achieve any of the
rates in the QMAC capacity region using a single codebook, whereas time-sharing
requires us to switch between the two codebooks for the vertices. This distinction
is minor in the context of the multiple access channel problem, but it will become
important in situations where there are multiple receivers as in the compound multiple
access channel and the interference channel. Note that Sen gave an alternate proof of
Theorem 4.2 using a different approach [Sen12a].



Proof of Theorem 4.2. The proof proceeds by random coding arguments using the
properties of projectors onto the typical subspaces of the output states and the square
root measurement.

Consider some choice $p = p_{X_1}(x_1)\,p_{X_2}(x_2)$ for the input distributions.

Codebook construction: Randomly and independently generate $2^{nR_1}$ sequences
$x_1^n(m_1)$, $m_1 \in [1 : 2^{nR_1}]$, according to $\prod_{i=1}^{n} p_{X_1}(x_{1i})$. Similarly, generate randomly and
independently the codebook $\{x_2^n(m_2)\}$, $m_2 \in [1 : 2^{nR_2}]$, according to $\prod_{i=1}^{n} p_{X_2}(x_{2i})$.

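POVM construction: In order to lighten the notation, the channel output will be
denoted with the shorthand $\rho_{m_1,m_2} \equiv \rho_{x_1^n(m_1),x_2^n(m_2)}$ when the inputs to the channel
are $x_1^n(m_1)$ and $x_2^n(m_2)$. Let $\Pi^n_{m_1,m_2} \equiv \Pi^n_{\rho_{x_1^n(m_1),x_2^n(m_2)},\delta}$ be the conditionally typical
projector for that state. Consider the following averaged output states:

$$\bar{\rho}_{x_1} \equiv \sum_{x_2} p_{X_2}(x_2)\,\rho_{x_1,x_2}, \tag{4.26}$$
$$\bar{\rho}_{x_2} \equiv \sum_{x_1} p_{X_1}(x_1)\,\rho_{x_1,x_2}, \tag{4.27}$$
$$\bar{\rho} \equiv \sum_{x_1,x_2} p_{X_1}(x_1)\,p_{X_2}(x_2)\,\rho_{x_1,x_2}. \tag{4.28}$$

Let $\Pi^n_{m_1} \equiv \Pi^n_{\bar{\rho}_{x_1^n(m_1)},\delta}$ be the conditionally typical projector for the tensor product state
$\bar{\rho}_{m_1} \equiv \bar{\rho}_{x_1^n(m_1)}$ defined by (4.26) for $n$ uses of the channel. Let $\Pi^n_{m_2} \equiv \Pi^n_{\bar{\rho}_{x_2^n(m_2)},\delta}$ be the
conditionally typical projector for the tensor product state $\bar{\rho}_{m_2} \equiv \bar{\rho}_{x_2^n(m_2)}$ defined by
(4.27), and finally let $\Pi^n_{\bar{\rho},\delta}$ be the typical projector for the state $\bar{\rho}^{\otimes n}$ defined by (4.28).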



The detection POVM $\{\Lambda_{m_1,m_2}\}$ has the following form:

$$\Lambda_{m_1,m_2} \equiv \Big(\sum_{m_1',m_2'} P_{m_1',m_2'}\Big)^{-1/2} P_{m_1,m_2} \Big(\sum_{m_1',m_2'} P_{m_1',m_2'}\Big)^{-1/2},$$

where

$$P_{m_1,m_2} \equiv \Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1}\,\Pi^n_{m_1,m_2}\,\Pi^n_{m_1}\,\Pi^n_{\bar{\rho},\delta} \tag{4.29}$$

is a positive operator which consists of three typical projectors "sandwiched" together.
Observe that the layers of the sandwich go from the more general ones on the outside
to the more specific ones on the inside. Observe also that the conditionally typical
projector $\Pi^n_{m_2}$ is not included.

The average error probability of the code is given by:

$$\bar{p}_e \equiv \frac{1}{|\mathcal{M}_1||\mathcal{M}_2|} \sum_{m_1,m_2} \mathrm{Tr}\big[(I - \Lambda_{m_1,m_2})\,\rho_{m_1,m_2}\big]. \tag{4.30}$$



The first step in our error analysis is to make a substitution of the output state
$\rho_{m_1,m_2}$ with a smoothed version

$$\tilde{\rho}_{m_1,m_2} \equiv \Pi^n_{m_2}\,\rho_{m_1,m_2}\,\Pi^n_{m_2}. \tag{4.31}$$

We do this to ensure that we will have the operator $\Pi^n_{m_2}$ inside the trace when we
perform the averaging. The term smoothing refers to the fact that we are now coding
for a different channel which has everything outside the $\Pi^n_{m_2}$-typical subspace removed, i.e., we
remove the "spikes" (the large eigenvalues).

We can use the inequality

$$\mathrm{Tr}[\Lambda\rho] \leq \mathrm{Tr}[\Lambda\sigma] + \|\rho - \sigma\|_1 \tag{4.32}$$

from Lemma 2.1, which holds for all operators such that $0 \leq \rho, \sigma, \Lambda \leq I$, in order to
bound the smoothing penalty which we incur as a result of the substitution.



After the substitution step (4.30) and the use of (4.32), we obtain the following
bound on the probability of error:

$$\bar{p}_e \leq \frac{1}{|\mathcal{M}_1||\mathcal{M}_2|} \sum_{m_1,m_2} \Big( \mathrm{Tr}\big[(I - \Lambda_{m_1,m_2})\,\tilde{\rho}_{m_1,m_2}\big] + \|\tilde{\rho}_{m_1,m_2} - \rho_{m_1,m_2}\|_1 \Big). \tag{4.33}$$

The next step is to use the Hayashi-Nagaoka operator inequality [HN03] (Lemma 3.1):

$$I - (S + T)^{-1/2}\, S\, (S + T)^{-1/2} \leq 2(I - S) + 4T.$$

Choosing $S = P_{m_1,m_2}$ and $T = \sum_{(m_1',m_2') \neq (m_1,m_2)} P_{m_1',m_2'}$, we apply the operator
inequality to bound the first term in (4.33), so that the average error probability satisfies:

$$\bar{p}_e \leq \frac{1}{|\mathcal{M}_1||\mathcal{M}_2|} \sum_{m_1,m_2} \Big( 2\,\mathrm{Tr}\big[(I - P_{m_1,m_2})\,\tilde{\rho}_{m_1,m_2}\big] + 4 \sum_{(m_1',m_2') \neq (m_1,m_2)} \mathrm{Tr}\big[P_{m_1',m_2'}\,\tilde{\rho}_{m_1,m_2}\big] + \|\tilde{\rho}_{m_1,m_2} - \rho_{m_1,m_2}\|_1 \Big). \tag{4.34}$$

The three terms in the summation have an intuitive interpretation. The first term
corresponds to the case when the output state is non-typical, the second term describes
the probability of a wrong message being decoded, and the third term accounts for the
smoothing penalty which we have to pay for using a code designed for the channel
$\tilde{\rho}_{m_1,m_2}$ on the channel $\rho_{m_1,m_2}$.

We apply a random coding argument to bound the expectation of the average error
probability in (4.34). We compute the expected value of the error terms with respect to
the random choice of codebook: $\{X_1^n(m_1)\}$, $\{X_2^n(m_2)\}$. Recall that in our shorthand
notation, the codewords are not indicated. Thus when we say $\mathbb{E}_{X_1^n,X_2^n}\,\rho_{m_1,m_2}$, we really
mean $\mathbb{E}_{X_1^n,X_2^n}\,\rho_{X_1^n(m_1),X_2^n(m_2)}$.



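A bound on the first term in (4.34) follows from the following argument:

$$\begin{aligned}
\mathbb{E}_{X_1^n,X_2^n} \mathrm{Tr}\big[P_{m_1,m_2}\,\tilde{\rho}_{m_1,m_2}\big]
&= \mathbb{E}\,\mathrm{Tr}\big[\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1}\,\Pi^n_{m_1,m_2}\,\Pi^n_{m_1}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_2}\,\rho_{m_1,m_2}\,\Pi^n_{m_2}\big] \\
&\geq \mathbb{E}\,\mathrm{Tr}\big[\Pi^n_{m_1,m_2}\,\rho_{m_1,m_2}\big] - \mathbb{E}\,\|\Pi^n_{m_2}\rho_{m_1,m_2}\Pi^n_{m_2} - \rho_{m_1,m_2}\|_1 \\
&\qquad - \mathbb{E}\,\|\Pi^n_{m_1}\rho_{m_1,m_2}\Pi^n_{m_1} - \rho_{m_1,m_2}\|_1 - \mathbb{E}\,\|\Pi^n_{\bar{\rho},\delta}\,\rho_{m_1,m_2}\,\Pi^n_{\bar{\rho},\delta} - \rho_{m_1,m_2}\|_1 \\
&\geq \mathbb{E}\,\mathrm{Tr}\big[\Pi^n_{m_1,m_2}\,\rho_{m_1,m_2}\big] - 6\sqrt{\epsilon} \\
&\geq 1 - \epsilon - 6\sqrt{\epsilon}.
\end{aligned} \tag{4.35}$$

The first inequality follows from (4.32) (Lemma 2.1) applied three times. The second
inequality follows from Lemma 3.2 and the properties of the conditionally typical
projectors: (B.40), (B.41) and (B.42) given in Appendix B.1. The last inequality follows
from equation (B.39).

The same reasoning is used to obtain a bound on the expectation of the smoothing
penalty (the third term in (4.34)):

$$\mathbb{E}_{X_1^n,X_2^n} \|\tilde{\rho}_{m_1,m_2} - \rho_{m_1,m_2}\|_1 = \mathbb{E}_{X_1^n,X_2^n} \|\Pi^n_{m_2}\,\rho_{m_1,m_2}\,\Pi^n_{m_2} - \rho_{m_1,m_2}\|_1 \leq 2\sqrt{\epsilon}. \tag{4.36}$$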



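The main part of the error analysis consists of obtaining a bound on the second
term in (4.34). This term corresponds to the probability that a wrong message pair
is decoded by the receiver. We split this term into three parts, each representing a
different type of decoding error:

$$\begin{aligned}
\sum_{(m_1',m_2') \neq (m_1,m_2)} \mathrm{Tr}\big[P_{m_1',m_2'}\,\tilde{\rho}_{m_1,m_2}\big]
&= \sum_{m_1' \neq m_1} \mathrm{Tr}\big[P_{m_1',m_2}\,\tilde{\rho}_{m_1,m_2}\big] && \text{(E1)} \\
&\quad + \sum_{m_2' \neq m_2} \mathrm{Tr}\big[P_{m_1,m_2'}\,\tilde{\rho}_{m_1,m_2}\big] && \text{(E2)} \\
&\quad + \sum_{m_1' \neq m_1,\, m_2' \neq m_2} \mathrm{Tr}\big[P_{m_1',m_2'}\,\tilde{\rho}_{m_1,m_2}\big]. && \text{(E12)}
\end{aligned}$$

We will bound each of these terms in turn.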



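Bound on (E1): The expectation over the random choice of codebook for the error
term (E1) is as follows:

$$\begin{aligned}
\mathbb{E}_{X_1^n,X_2^n}\{\text{(E1)}\}
&= \mathbb{E}_{X_1^n,X_2^n}\Big\{ \sum_{m_1' \neq m_1} \mathrm{Tr}\big[P_{m_1',m_2}\,\tilde{\rho}_{m_1,m_2}\big] \Big\} \\
&\overset{①}{=} \sum_{m_1' \neq m_1} \mathbb{E}_{X_2^n}\Big\{ \mathrm{Tr}\Big[ \mathbb{E}_{X_1^n}\{P_{m_1',m_2}\}\; \mathbb{E}_{X_1^n}\{\tilde{\rho}_{m_1,m_2}\} \Big] \Big\} \\
&= \sum_{m_1' \neq m_1} \mathbb{E}_{X_2^n}\Big\{ \mathrm{Tr}\Big[ \mathbb{E}_{X_1^n}\{P_{m_1',m_2}\}\; \Pi^n_{m_2}\, \mathbb{E}_{X_1^n}\{\rho_{m_1,m_2}\}\, \Pi^n_{m_2} \Big] \Big\} \\
&\overset{②}{=} \sum_{m_1' \neq m_1} \mathbb{E}_{X_2^n}\Big\{ \mathrm{Tr}\Big[ \mathbb{E}_{X_1^n}\{P_{m_1',m_2}\}\; \Pi^n_{m_2}\, \bar{\rho}_{m_2}\, \Pi^n_{m_2} \Big] \Big\} \\
&\overset{③}{\leq} 2^{-n[H(B|X_2)-\delta]} \sum_{m_1' \neq m_1} \mathbb{E}_{X_1^n,X_2^n}\Big\{ \mathrm{Tr}\big[ P_{m_1',m_2}\, \Pi^n_{m_2} \big] \Big\}.
\end{aligned}$$

Equation ① follows because the codewords for $m_1'$ and $m_1$ are independent. Equality ②
comes from the definition of the averaged code state $\bar{\rho}_{m_2} \equiv \bar{\rho}_{x_2^n(m_2)}$. Inequality ③
follows from the bound

$$\Pi^n_{m_2}\,\bar{\rho}_{m_2}\,\Pi^n_{m_2} \leq 2^{-n[H(B|X_2)-\delta]}\,\Pi^n_{m_2}.$$

We focus our attention on the expression inside the trace:

$$\begin{aligned}
\mathrm{Tr}\big[P_{m_1',m_2}\,\Pi^n_{m_2}\big]
&= \mathrm{Tr}\big[\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1'}\,\Pi^n_{m_1',m_2}\,\Pi^n_{m_1'}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_2}\big] \\
&\overset{④}{=} \mathrm{Tr}\big[\Pi^n_{m_1'}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_2}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1'}\,\Pi^n_{m_1',m_2}\big] \\
&\overset{⑤}{\leq} \mathrm{Tr}\big[\Pi^n_{m_1',m_2}\big].
\end{aligned}$$

In the first step we substituted the definition of $P_{m_1',m_2}$ from equation (4.29). Equality
④ follows from the cyclicity of trace. Inequality ⑤ follows from

$$\Pi^n_{m_1'}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_2}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1'} \leq \Pi^n_{m_1'}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1'} \leq \Pi^n_{m_1'} \leq I. \tag{4.37}$$

Next, we obtain the following bound on the expected probability of the term (E1):

$$\begin{aligned}
\mathbb{E}_{X_1^n,X_2^n}\{\text{(E1)}\}
&\leq 2^{-n[H(B|X_2)-\delta]} \sum_{m_1' \neq m_1} \mathbb{E}_{X_1^n,X_2^n}\Big\{ \mathrm{Tr}\big[\Pi^n_{m_1',m_2}\big] \Big\} \\
&\overset{⑥}{\leq} |\mathcal{M}_1|\; 2^{-n[I(X_1;B|X_2)-2\delta]}.
\end{aligned} \tag{4.38}$$

Inequality ⑥ follows from the bound $\mathrm{Tr}\big[\Pi^n_{m_1',m_2}\big] \leq 2^{n[H(B|X_1X_2)+\delta]}$
on the rank of a conditionally typical projector.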



Bound on (E2): We employ a different argument to bound the probability of the
second error term (E2) based on the following fact:

$$\begin{aligned}
\Pi^n_{m_1,m_2}
&\leq 2^{n[H(B|X_1X_2)+\delta]}\; \Pi^n_{m_1,m_2}\,\rho_{m_1,m_2}\,\Pi^n_{m_1,m_2} \\
&= 2^{n[H(B|X_1X_2)+\delta]}\; \sqrt{\rho_{m_1,m_2}}\;\Pi^n_{m_1,m_2}\,\sqrt{\rho_{m_1,m_2}} \\
&\leq 2^{n[H(B|X_1X_2)+\delta]}\; \rho_{m_1,m_2},
\end{aligned} \tag{4.39}$$

which we refer to as the projector trick [GLM12]. The first inequality is the standard
lower bound on the eigenvalues of $\rho_{m_1,m_2}$ inside its typical subspace, expressed as an operator upper bound on
the projector $\Pi^n_{m_1,m_2}$. The equality follows because the state and its typical projector
commute. The last inequality follows from $0 \leq \Pi^n_{m_1,m_2} \leq I$.



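We now proceed to bound the expectation of the error term (E2).

$$\begin{aligned}
\mathbb{E}_{X_1^n,X_2^n}\{\text{(E2)}\}
&= \mathbb{E}_{X_1^n,X_2^n}\Big\{ \sum_{m_2' \neq m_2} \mathrm{Tr}\big[P_{m_1,m_2'}\,\tilde{\rho}_{m_1,m_2}\big] \Big\} \\
&= \sum_{m_2' \neq m_2} \mathbb{E}_{X_1^n}\Big\{ \mathrm{Tr}\Big[ \mathbb{E}_{X_2^n}\{P_{m_1,m_2'}\}\; \mathbb{E}_{X_2^n}\{\tilde{\rho}_{m_1,m_2}\} \Big] \Big\} \\
&= \sum_{m_2' \neq m_2} \mathbb{E}_{X_1^n}\Big\{ \mathrm{Tr}\Big[ \mathbb{E}_{X_2^n}\{\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1}\,\Pi^n_{m_1,m_2'}\,\Pi^n_{m_1}\,\Pi^n_{\bar{\rho},\delta}\}\; \mathbb{E}_{X_2^n}\{\tilde{\rho}_{m_1,m_2}\} \Big] \Big\} \\
&= \sum_{m_2' \neq m_2} \mathbb{E}_{X_1^n}\Big\{ \mathrm{Tr}\Big[ \mathbb{E}_{X_2^n}\{\Pi^n_{m_1}\,\Pi^n_{m_1,m_2'}\,\Pi^n_{m_1}\}\; \Pi^n_{\bar{\rho},\delta}\, \mathbb{E}_{X_2^n}\{\tilde{\rho}_{m_1,m_2}\}\, \Pi^n_{\bar{\rho},\delta} \Big] \Big\}.
\end{aligned}$$

We focus our attention on the first expectation inside the trace:

$$\begin{aligned}
\mathbb{E}_{X_2^n}\{\Pi^n_{m_1}\,\Pi^n_{m_1,m_2'}\,\Pi^n_{m_1}\}
&\overset{⑦}{\leq} 2^{n[H(B|X_1X_2)+\delta]}\; \mathbb{E}_{X_2^n}\{\Pi^n_{m_1}\,\rho_{m_1,m_2'}\,\Pi^n_{m_1}\} \\
&= 2^{n[H(B|X_1X_2)+\delta]}\; \Pi^n_{m_1}\,\bar{\rho}_{m_1}\,\Pi^n_{m_1} \\
&\overset{⑧}{\leq} 2^{n[H(B|X_1X_2)+\delta]}\; 2^{-n[H(B|X_1)-\delta]}\; \Pi^n_{m_1} \\
&= 2^{-n[I(X_2;B|X_1)-2\delta]}\; \Pi^n_{m_1}.
\end{aligned}$$

In inequality ⑦ we used the projector trick from (4.39). Inequality ⑧ follows from the
properties of the conditionally typical projector $\Pi^n_{m_1}$.

Substituting back into the expression for the error bound, we obtain:

$$\begin{aligned}
\mathbb{E}_{X_1^n,X_2^n}\{\text{(E2)}\}
&\leq 2^{-n[I(X_2;B|X_1)-2\delta]} \sum_{m_2' \neq m_2} \mathbb{E}_{X_1^n}\,\mathrm{Tr}\Big[ \Pi^n_{m_1}\,\Pi^n_{\bar{\rho},\delta}\, \mathbb{E}_{X_2^n}\{\tilde{\rho}_{m_1,m_2}\}\, \Pi^n_{\bar{\rho},\delta} \Big] \\
&= 2^{-n[I(X_2;B|X_1)-2\delta]} \sum_{m_2' \neq m_2} \mathbb{E}_{X_1^n,X_2^n}\,\mathrm{Tr}\big[ \Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_2}\,\rho_{m_1,m_2}\,\Pi^n_{m_2} \big] \\
&= 2^{-n[I(X_2;B|X_1)-2\delta]} \sum_{m_2' \neq m_2} \mathbb{E}_{X_1^n,X_2^n}\,\mathrm{Tr}\big[ \Pi^n_{m_2}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_2}\,\rho_{m_1,m_2} \big] \\
&\overset{⑨}{\leq} 2^{-n[I(X_2;B|X_1)-2\delta]} \sum_{m_2' \neq m_2} \mathbb{E}_{X_1^n,X_2^n}\,\mathrm{Tr}\big[ \rho_{m_1,m_2} \big] \\
&\leq 2^{-n[I(X_2;B|X_1)-2\delta]}\; |\mathcal{M}_2|.
\end{aligned} \tag{4.40}$$

Inequality ⑨ follows from an argument analogous to (4.37).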



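Bound on (E12): We use a slightly different argument in order to bound the
probability of the third error term:

$$\begin{aligned}
\mathbb{E}_{X_1^n,X_2^n}\{\text{(E12)}\}
&= \mathbb{E}_{X_1^n,X_2^n}\Big\{ \sum_{m_1' \neq m_1,\, m_2' \neq m_2} \mathrm{Tr}\big[P_{m_1',m_2'}\,\tilde{\rho}_{m_1,m_2}\big] \Big\} \\
&\overset{⑩}{=} \sum_{m_1' \neq m_1,\, m_2' \neq m_2} \mathrm{Tr}\Big[ \mathbb{E}_{X_1^n,X_2^n}\{P_{m_1',m_2'}\}\; \mathbb{E}_{X_1^n,X_2^n}\{\tilde{\rho}_{m_1,m_2}\} \Big] \\
&\overset{⑪}{=} \sum_{m_1' \neq m_1,\, m_2' \neq m_2} \mathrm{Tr}\Big[ \mathbb{E}_{X_1^n,X_2^n}\{P_{m_1',m_2'}\}\; \mathbb{E}_{X_2^n}\{\Pi^n_{m_2}\,\bar{\rho}_{m_2}\,\Pi^n_{m_2}\} \Big] \\
&\overset{⑫}{\leq} \sum_{m_1' \neq m_1,\, m_2' \neq m_2} \mathrm{Tr}\Big[ \mathbb{E}_{X_1^n,X_2^n}\{P_{m_1',m_2'}\}\; \mathbb{E}_{X_2^n}\{\bar{\rho}_{m_2}\} \Big] \\
&= \sum_{m_1' \neq m_1,\, m_2' \neq m_2} \mathbb{E}_{X_1^n,X_2^n}\,\mathrm{Tr}\big[ \Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1'}\,\Pi^n_{m_1',m_2'}\,\Pi^n_{m_1'}\,\Pi^n_{\bar{\rho},\delta}\;\bar{\rho}^{\otimes n} \big] \\
&\overset{⑬}{\leq} 2^{-n[H(B)-\delta]} \sum_{m_1' \neq m_1,\, m_2' \neq m_2} \mathbb{E}_{X_1^n,X_2^n}\,\mathrm{Tr}\big[ \Pi^n_{m_1'}\,\Pi^n_{m_1',m_2'}\,\Pi^n_{m_1'}\,\Pi^n_{\bar{\rho},\delta} \big] \\
&\overset{⑭}{\leq} 2^{-n[H(B)-\delta]} \sum_{m_1' \neq m_1,\, m_2' \neq m_2} \mathbb{E}_{X_1^n,X_2^n}\,\mathrm{Tr}\big[ \Pi^n_{m_1',m_2'} \big] \\
&\overset{⑮}{\leq} 2^{-n[H(B)-\delta]}\; 2^{n[H(B|X_1X_2)+\delta]}\; |\mathcal{M}_1||\mathcal{M}_2| \\
&= |\mathcal{M}_1||\mathcal{M}_2|\; 2^{-n[I(X_1X_2;B)-2\delta]}.
\end{aligned} \tag{4.41}$$

Equality ⑩ follows from the independence of the codewords. To obtain equality ⑪ we
take the $X_1^n$ expectation over the state. Inequality ⑫ follows from $\Pi^n_{m_2}\,\bar{\rho}_{m_2}\,\Pi^n_{m_2} =
\sqrt{\bar{\rho}_{m_2}}\;\Pi^n_{m_2}\,\sqrt{\bar{\rho}_{m_2}} \leq \bar{\rho}_{m_2}$. Inequality ⑬ is obtained by using the cyclicity of trace to
surround the state $\bar{\rho}^{\otimes n}$ by its typical projectors and then using the property
$\Pi^n_{\bar{\rho},\delta}\,\bar{\rho}^{\otimes n}\,\Pi^n_{\bar{\rho},\delta} \leq 2^{-n[H(B)-\delta]}\,\Pi^n_{\bar{\rho},\delta}$ of the average output-typical projector. Inequality ⑭ follows from
$\Pi^n_{m_1'}\,\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1'} \leq \Pi^n_{m_1'} \leq I$. Finally, inequality ⑮ follows from the bound on the rank
of the conditionally typical projector.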



Combining the bounds from equations (4.35), (4.38), (4.40), (4.41) and the smoothing
penalty from (4.36), we get the following bound on the expectation of the average
error probability:

$$\mathbb{E}_{X_1^n,X_2^n}\{\bar{p}_e\} \leq 2\big(\epsilon + 6\sqrt{\epsilon}\big) + 2\sqrt{\epsilon} + 4\Big[ |\mathcal{M}_1|\,2^{-n[I(X_1;B|X_2)-2\delta]} + |\mathcal{M}_2|\,2^{-n[I(X_2;B|X_1)-2\delta]} + |\mathcal{M}_1||\mathcal{M}_2|\,2^{-n[I(X_1X_2;B)-2\delta]} \Big].$$

Thus, if we choose the message set sizes to be $|\mathcal{M}_1| = 2^{n[R_1-3\delta]}$ and $|\mathcal{M}_2| = 2^{n[R_2-3\delta]}$,
the expectation of the average error probability vanishes whenever the rates $R_1$ and
$R_2$ obey the inequalities:

$$R_1 - \delta < I(X_1;B|X_2),$$
$$R_2 - \delta < I(X_2;B|X_1),$$
$$R_1 + R_2 - 4\delta < I(X_1X_2;B).$$

If the probability of error of a random code vanishes, then there must exist a particular
code with vanishing average error probability, and given that $\delta > 0$ is an arbitrarily
small number, the bounds in the statement of the theorem follow. □
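The role of the three rate conditions can be checked numerically. The following is a minimal sketch, not part of the proof: it evaluates the three wrong-message terms of the bound with the message set sizes chosen above. The function name and the numerical values of the information quantities are hypothetical, chosen only for illustration.

```python
def error_exponent_terms(n, delta, R1, R2, I1, I2, I12):
    """Return the three wrong-message terms of the random-coding bound.

    Assumes |M_k| = 2^{n[R_k - 3*delta]} as in the proof, with
    I1 = I(X1;B|X2), I2 = I(X2;B|X1), I12 = I(X1X2;B), all in bits.
    """
    # |M1| * 2^{-n[I1 - 2 delta]} = 2^{n[R1 - delta - I1]}
    e1 = 2.0 ** (n * (R1 - delta - I1))
    # |M2| * 2^{-n[I2 - 2 delta]} = 2^{n[R2 - delta - I2]}
    e2 = 2.0 ** (n * (R2 - delta - I2))
    # |M1||M2| * 2^{-n[I12 - 2 delta]} = 2^{n[R1 + R2 - 4 delta - I12]}
    e12 = 2.0 ** (n * (R1 + R2 - 4 * delta - I12))
    return e1, e2, e12


if __name__ == "__main__":
    # Hypothetical channel quantities, for illustration only.
    for n in (10, 100, 1000):
        print(n, error_exponent_terms(n, 0.01, 0.45, 0.45, 0.6, 0.6, 1.0))
```

When all three rate inequalities hold, each term decays exponentially in $n$; if one of them is violated, the corresponding term grows instead.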



We now state a corollary regarding the "coded time-sharing" approach to the
MAC problem [HK81, EGK10]. The main idea is to introduce an auxiliary random
variable $Q$ distributed according to $p_Q(q)$ and use the probability distribution
$p_Q(q)\,p_{X_1|Q}(x_1|q)\,p_{X_2|Q}(x_2|q)$ for the codebook construction. First we generate a random
sequence $q^n \sim \prod_{i=1}^{n} p_Q(q_i)$, then pick the codeword sequences $x_1^n$ and $x_2^n$ according
to the distributions $p_{X_1^n|Q^n}(x_1^n|q^n) = \prod_{i=1}^{n} p_{X_1|Q}(x_{1i}|q_i)$ and $p_{X_2^n|Q^n}(x_2^n|q^n) =
\prod_{i=1}^{n} p_{X_2|Q}(x_{2i}|q_i)$.

Corollary 4.1 (Coded time-sharing for QMAC). Suppose that the rates $R_1$ and $R_2$
satisfy the following inequalities:

$$R_1 < I(X_1;B|X_2Q)_\theta, \tag{4.42}$$
$$R_2 < I(X_2;B|X_1Q)_\theta, \tag{4.43}$$
$$R_1 + R_2 < I(X_1X_2;B|Q)_\theta, \tag{4.44}$$

where the entropies are with respect to a state $\theta^{QX_1X_2B}$ of the following form:

$$\sum_{q,x_1,x_2} p_Q(q)\,p_{X_1|Q}(x_1|q)\,p_{X_2|Q}(x_2|q)\; |q\rangle\langle q|^{Q} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes |x_2\rangle\langle x_2|^{X_2} \otimes \rho^{B}_{x_1,x_2}. \tag{4.45}$$

Then there exists a corresponding simultaneous decoding POVM $\{\Lambda_{m_1,m_2}\}$ such that
the expectation of the average probability of error is bounded above by $\epsilon$ for all $\epsilon > 0$
and sufficiently large $n$.



The proof of Corollary 4.1 proceeds exactly as the proof of Theorem 4.2, but all
the typical projectors are chosen conditionally on $Q^n$, and we take the expectation
over $Q^n$ in the error analysis. The statement of the QMAC capacity rates using coded
time-sharing will be important for the results in Chapter 5.
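The codebook generation for coded time-sharing can be sketched as follows. This is an illustrative sketch under stated assumptions: distributions are represented as Python dictionaries, and the function names `sample` and `time_sharing_codebooks` are hypothetical helpers, not notation from the text. A single sequence $q^n$ is drawn first, and every codeword of both senders is then drawn symbol by symbol conditionally on it.

```python
import random


def sample(dist, rng):
    """Draw one symbol from a distribution given as {symbol: probability}."""
    symbols = list(dist)
    weights = [dist[s] for s in symbols]
    return rng.choices(symbols, weights=weights, k=1)[0]


def time_sharing_codebooks(p_q, p_x1_given_q, p_x2_given_q,
                           n, num_m1, num_m2, seed=0):
    """Generate q^n and two random codebooks conditioned on q^n."""
    rng = random.Random(seed)
    # One time-sharing sequence q^n, shared by both codebooks.
    q_seq = [sample(p_q, rng) for _ in range(n)]
    # Each codeword symbol x_{ki} is drawn from p_{Xk|Q}(. | q_i).
    book1 = [[sample(p_x1_given_q[q], rng) for q in q_seq]
             for _ in range(num_m1)]
    book2 = [[sample(p_x2_given_q[q], rng) for q in q_seq]
             for _ in range(num_m2)]
    return q_seq, book1, book2


if __name__ == "__main__":
    p_q = {0: 0.5, 1: 0.5}
    # When q = 0 Sender 1 always sends 0; when q = 1 she sends uniform bits.
    p_x1 = {0: {0: 1.0}, 1: {0: 0.5, 1: 0.5}}
    p_x2 = {0: {0: 0.5, 1: 0.5}, 1: {1: 1.0}}
    q_seq, book1, book2 = time_sharing_codebooks(p_q, p_x1, p_x2, 8, 4, 4)
    print(q_seq)
    print(book1[0], book2[0])
```

The decoder is assumed to know $q^n$, which is why the typical projectors in the proof can be chosen conditionally on $Q^n$.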



4.3.1 Conjecture for three-sender simultaneous decoding 

We now state our conjecture regarding the existence of a quantum simultaneous decoder
for a classical-quantum multiple access channel with three senders. We focus on the
case of three senders, because this is the form that will be required in Section 5.3 for
the achievability proof of the quantum Han-Kobayashi achievable rate region [HK81,
Sen12a].

Conjecture 4.1 (Three-sender quantum simultaneous decoder).
Let $(\mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{X}_3, \rho_{x_1,x_2,x_3}, \mathcal{H}^B)$ be a classical-quantum multiple access channel with
three senders. Let $p_{X_1}$, $p_{X_2}$ and $p_{X_3}$ be distributions on the inputs. Define the
following random code: let $\{X_1^n(m_1)\}_{m_1 \in \{1,\ldots,|\mathcal{M}_1|\}}$ be an independent random codebook
distributed according to the product distribution $p_{X_1^n}$ and similarly and independently
let $\{X_2^n(m_2)\}_{m_2 \in \{1,\ldots,|\mathcal{M}_2|\}}$ and $\{X_3^n(m_3)\}_{m_3 \in \{1,\ldots,|\mathcal{M}_3|\}}$ be independent random codebooks
distributed according to product distributions $p_{X_2^n}$ and $p_{X_3^n}$. Suppose that the rates of
the codebooks obey the following inequalities:

$$R_1 < I(X_1;B|X_2X_3)_\rho,$$
$$R_2 < I(X_2;B|X_1X_3)_\rho,$$
$$R_3 < I(X_3;B|X_1X_2)_\rho,$$
$$R_1 + R_2 < I(X_1X_2;B|X_3)_\rho,$$
$$R_1 + R_3 < I(X_1X_3;B|X_2)_\rho,$$
$$R_2 + R_3 < I(X_2X_3;B|X_1)_\rho,$$
$$R_1 + R_2 + R_3 < I(X_1X_2X_3;B)_\rho,$$

where the Holevo information quantities are with respect to the following classical-
quantum state:

$$\rho^{X_1X_2X_3B} \equiv \sum_{x_1,x_2,x_3} p_{X_1}(x_1)\,p_{X_2}(x_2)\,p_{X_3}(x_3)\; |x_1\rangle\langle x_1|^{X_1} \otimes |x_2\rangle\langle x_2|^{X_2} \otimes |x_3\rangle\langle x_3|^{X_3} \otimes \rho^{B}_{x_1,x_2,x_3}. \tag{4.46}$$

Then there exists a simultaneous decoding POVM $\{\Lambda_{m_1,m_2,m_3}\}_{m_1,m_2,m_3}$ such that the
expectation of the average probability of error is bounded above by $\epsilon$ for all $\epsilon > 0$ and
sufficiently large $n$:

$$\mathbb{E}\Big\{ \frac{1}{|\mathcal{M}_1||\mathcal{M}_2||\mathcal{M}_3|} \sum_{m_1,m_2,m_3} \mathrm{Tr}\big[(I - \Lambda_{m_1,m_2,m_3})\,\rho_{X_1^n(m_1),X_2^n(m_2),X_3^n(m_3)}\big] \Big\} \leq \epsilon,$$

where the expectation is with respect to $X_1^n$, $X_2^n$, and $X_3^n$.

The importance of this conjecture stems from the fact that it might be broadly useful
for "quantizing" other results from classical multiuser information theory [FHS+12].
Indeed, many coding theorems in classical network information theory exploit a
simultaneous decoding approach (sometimes known as jointly typical decoding) [EGK10].
Also, Dutil and Hayden have recently put forward a related conjecture known as the
"multiparty typicality" conjecture [Dut11a], and it is likely that a proof of
Conjecture 4.1 could aid in producing a proof of the multiparty typicality conjecture or vice
versa. The notion of a multiparty quantum typicality also appears in the problem of
universal state merging [BBJ11]. Recent progress towards the proof of this conjecture
can be found in [Sen12b].

The conjecture naturally extends to M senders, but we have described the three-sender
case because this is the form that will be required for the Han-Kobayashi strategy
discussed in Section 5.3.
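The structure of the rate constraints, one inequality per non-empty subset of senders, can be made explicit with a short enumeration. The sketch below only generates the labels of the inequalities; the function name `mac_rate_constraints` is a hypothetical helper, and the form shown for more than three senders is the natural extension mentioned above, not an established result.

```python
from itertools import combinations


def mac_rate_constraints(num_senders):
    """List the 2^M - 1 simultaneous-decoding rate constraints as strings."""
    senders = list(range(1, num_senders + 1))
    constraints = []
    for size in range(1, num_senders + 1):
        for subset in combinations(senders, size):
            # Senders outside the subset appear as conditioning variables.
            rest = [k for k in senders if k not in subset]
            lhs = " + ".join(f"R{k}" for k in subset)
            decoded = "".join(f"X{k}" for k in subset)
            cond = ("|" + "".join(f"X{k}" for k in rest)) if rest else ""
            constraints.append(f"{lhs} < I({decoded};B{cond})")
    return constraints


if __name__ == "__main__":
    for line in mac_rate_constraints(3):
        print(line)
```

For two senders the enumeration reproduces the three inequalities of Theorem 4.2, and for three senders it reproduces the seven inequalities of Conjecture 4.1.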



4.4 Rate-splitting 



Rate-splitting is another approach for achieving the rates of the classical multiple access
channel capacity region [GRUW01] which generalizes readily to the quantum setting
using the successive decoding approach in [Win01].

Lemma 4.1 (Quantum rate-splitting). For a given $p = p_{X_1}, p_{X_2}$, any rate pair $(R_1, R_2)$
that lies in between the two corner points of the MAC rate region, $\alpha_p$ and $\beta_p$, can be
achieved if Sender 2 splits her message $m_2$ into two parts $m_{2u}$ and $m_{2v}$ and encodes
them with a split codebook and a mixing function $(\{u^n(m_{2u})\}_{m_{2u}}, \{v^n(m_{2v})\}_{m_{2v}}, f)$.
The receiver decodes the messages in the order $m_{2u} \to m_1|m_{2u} \to m_{2v}|m_1 m_{2u}$ using
successive decoding. The total rate for Sender 2 is the sum $R_2 = R_{2u} + R_{2v}$.¹

The rate-split codebook consists of two random codebooks generated from $p_U$ and
$p_V$ and a mixing function such that $f(U, V) = X_2$ [GRUW01]. The rate-splitting coding
strategy for the two-sender quantum multiple access channel consists of a successive
decoding strategy for the following three channels:

$$(U^n, V^n, X_1^n, X_2^n) \to \rho^{B^n}_{x_1^n,x_2^n}, \tag{4.47}$$
$$(U^n, V^n, X_1^n, X_2^n) \to (U^n, \rho^{B^n}_{x_1^n,x_2^n}), \tag{4.48}$$
$$(U^n, V^n, X_1^n, X_2^n) \to (U^n, X_1^n, \rho^{B^n}_{x_1^n,x_2^n}). \tag{4.49}$$

The codebooks are constructed with the following rates:

$$R_{2u} = I(U;B) - \delta, \tag{4.50}$$
$$R_1 = I(X_1;B|U) - \delta, \tag{4.51}$$
$$R_{2v} = I(V;B|UX_1) - \delta. \tag{4.52}$$

¹ Alternately, the mixing can be performed using a switch random variable, S, which is a shared
randomness resource (denoted [cc]) between Sender 2 and the receiver [Rim01].

Observe that the resulting rate pair $(R_1, R_2) = (R_1, R_{2u} + R_{2v})$ is close to the dominant
facet of the rate region, which is defined as $R_1 + R_2 = I(X_1X_2;B)$, since:

$$\begin{aligned}
R_1 + R_2 &= R_{2u} + R_1 + R_{2v} \\
&= I(U;B) - \delta + I(X_1;B|U) - \delta + I(V;B|UX_1) - \delta \\
&= I(X_1X_2;B) - 3\delta.
\end{aligned}$$

By varying the choice of the distributions $p_U$ and $p_V$ and choosing the rates of
the split codebooks appropriately, we can achieve all the rates of the dominant facet,
and therefore all the rates of the region.
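The sum-rate identity used above is an instance of the chain rule for mutual information. As a sketch of the missing step, assuming (as in the construction) that $X_2 = f(U,V)$ and that the channel output depends on $(U,V)$ only through $X_2$, the three split rates telescope as follows:

```latex
\begin{align*}
I(U;B) + I(X_1;B|U) + I(V;B|UX_1)
  &= I(UX_1;B) + I(V;B|UX_1)  && \text{(chain rule)} \\
  &= I(UVX_1;B)               && \text{(chain rule)} \\
  &= I(X_1X_2;B)              && \text{(since } X_2 = f(U,V)\text{)}.
\end{align*}
```

Subtracting the three $\delta$ terms from (4.50), (4.51) and (4.52) then gives $R_1 + R_2 = I(X_1X_2;B) - 3\delta$.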

The choice of rate split $R_{2u} \leftrightarrow R_{2v}$ depends on the properties of the channel for
which we are coding. This dependence limits the usefulness of the rate-splitting
strategy in situations where there are multiple receivers. In general, we cannot choose the
rates of the split codebooks such that they will be optimal for two receivers. Receiver
1, whose output is the system $\rho^{B_1}_{x_1,x_2}$, would want the rates of the codebooks to be set at
$(R_{2u}, R_{2v}) = (I(U;B_1), I(V;B_1|UX_1))$, whereas Receiver 2, with outputs $\rho^{B_2}_{x_1,x_2}$, would
want to set $(R_{2u}, R_{2v}) = (I(U;B_2), I(V;B_2|UX_1))$. We will comment on this further
in the next chapter.



4.5 Example of a quantum multiple access channel 

We now show an example of a simple quantum multiple access channel for which we 
can compute the capacity region. 

Example 4.2. Consider the channel that takes two binary variables $X_1$ and $X_2$ as
inputs and outputs one of the four "BB84" states. The following table shows the
channel outputs for the different possible inputs.

              x_1 = 0          x_1 = 1
  x_2 = 0     $|0\rangle^B$    $|+\rangle^B$
  x_2 = 1     $|-\rangle^B$    $|1\rangle^B$

The classical-quantum state on which we evaluate information quantities is

$$\theta^{X_1X_2B} \equiv \sum_{x_1,x_2=0}^{1} p_{X_1}(x_1)\,p_{X_2}(x_2)\; |x_1\rangle\langle x_1|^{X_1} \otimes |x_2\rangle\langle x_2|^{X_2} \otimes \psi^{B}_{x_1,x_2},$$

where $\psi^{B}_{x_1,x_2}$ is one of $|0\rangle\langle 0|$, $|1\rangle\langle 1|$, $|+\rangle\langle +|$ or $|-\rangle\langle -|$ depending on the choice of the
input bits $x_1$ and $x_2$. The conditional entropy $H(B|X_1X_2)_\theta$ vanishes for this state
because the state is pure when conditioned on the classical registers $X_1$ and $X_2$. We
choose $p_{X_1}(x_1)$ and $p_{X_2}(x_2)$ to be the uniform distribution. This gives the following
state on $X_1$, $X_2$, and $B$:

$$\theta^{X_1X_2B} = \frac{1}{4}\Big[ |00\rangle\langle 00| \otimes |0\rangle\langle 0| + |01\rangle\langle 01| \otimes |-\rangle\langle -| + |10\rangle\langle 10| \otimes |+\rangle\langle +| + |11\rangle\langle 11| \otimes |1\rangle\langle 1| \Big].$$

From this state we can calculate the reduced density matrix $\theta^{X_2B} = \mathrm{Tr}_{X_1}[\theta^{X_1X_2B}]$ by
taking the partial trace over the $X_1$ system:

$$\theta^{X_2B} = \frac{1}{2}\Big[ |0\rangle\langle 0|^{X_2} \otimes \tfrac{1}{2}\big(|0\rangle\langle 0| + |+\rangle\langle +|\big)^{B} + |1\rangle\langle 1|^{X_2} \otimes \tfrac{1}{2}\big(|-\rangle\langle -| + |1\rangle\langle 1|\big)^{B} \Big],$$

from which we can determine that the conditional entropy $H(B|X_2)_\theta$ takes its maximum
value of $H_2(\cos^2(\pi/8))$ when $p_{X_1}(x_1)$ and $p_{X_2}(x_2)$ are uniform.

Taking the partial trace over $X_2$ we obtain the state

$$\theta^{X_1B} = \frac{1}{2}\Big[ |0\rangle\langle 0|^{X_1} \otimes \tfrac{1}{2}\big(|0\rangle\langle 0| + |-\rangle\langle -|\big)^{B} + |1\rangle\langle 1|^{X_1} \otimes \tfrac{1}{2}\big(|+\rangle\langle +| + |1\rangle\langle 1|\big)^{B} \Big],$$

from which we can observe that $H(B|X_1) = H_2(\cos^2(\pi/8))$.

Thus, the capacity region for this channel is:

$$R_1 \leq H_2(\cos^2(\pi/8)) \approx 0.6009,$$
$$R_2 \leq H_2(\cos^2(\pi/8)) \approx 0.6009,$$
$$R_1 + R_2 \leq 1.$$

Figure 4.7: The capacity region for the multiple access channel in Example 4.2.
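The entropies quoted in Example 4.2 can be verified numerically. The following is a minimal sketch in plain Python, assuming uniform inputs and representing each qubit state as a real symmetric 2×2 matrix; all function and variable names are illustrative and not taken from the text.

```python
import math


def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def eigenvalues_2x2(rho):
    """Eigenvalues of a real symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = rho
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean + radius, mean - radius


def entropy(rho):
    """Von Neumann entropy in bits of a 2x2 density matrix."""
    return -sum(lam * math.log2(lam) for lam in eigenvalues_2x2(rho) if lam > 1e-12)


def mix(*states):
    """Uniform mixture of 2x2 density matrices."""
    k = len(states)
    return [[sum(s[i][j] for s in states) / k for j in range(2)] for i in range(2)]


ZERO = [[1.0, 0.0], [0.0, 0.0]]
ONE = [[0.0, 0.0], [0.0, 1.0]]
PLUS = [[0.5, 0.5], [0.5, 0.5]]
MINUS = [[0.5, -0.5], [-0.5, 0.5]]

# Channel outputs indexed by the input pair (x1, x2), as in the table.
OUTPUT = {(0, 0): ZERO, (0, 1): MINUS, (1, 0): PLUS, (1, 1): ONE}

# H(B|X2): average over x2 of the entropy of the output averaged over x1.
h_b_given_x2 = sum(entropy(mix(OUTPUT[(0, x2)], OUTPUT[(1, x2)])) for x2 in (0, 1)) / 2
# H(B|X1): average over x1 of the entropy of the output averaged over x2.
h_b_given_x1 = sum(entropy(mix(OUTPUT[(x1, 0)], OUTPUT[(x1, 1)])) for x1 in (0, 1)) / 2
# H(B): entropy of the output averaged over both inputs.
h_b = entropy(mix(*OUTPUT.values()))

if __name__ == "__main__":
    print(h_b_given_x2, h_b_given_x1, h_b, h2(math.cos(math.pi / 8) ** 2))
```

Since $H(B|X_1X_2) = 0$ for this channel, the three computed entropies are exactly the right-hand sides of the three rate constraints.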



4.6 Discussion 



This concludes our exposition on the quantum multiple access channel. The techniques
used in the proof of Theorem 4.2 are the tools that will be used throughout the
remainder of this thesis. We review them here for the convenience of the reader and in
order to highlight them in isolation from the technicalities in the proof of Theorem 4.2.

The first idea is the POVM construction with layered typical projectors:

$$\Pi^n_{\bar{\rho},\delta}\,\Pi^n_{m_1}\,\Pi^n_{m_1,m_2}\,\Pi^n_{m_1}\,\Pi^n_{\bar{\rho},\delta}. \tag{4.53}$$

We call this a projector sandwich. Observe that the more specific projectors are on the
inside. Each of the projectors seems to be necessary in some part of the proof, and
this layering of the projectors ensures that the averaging can be performed.

The second idea that makes the quantum simultaneous decoder possible is the
state smoothing trick, which is to perform the error analysis with the unnormalized
state:

$$\tilde{\rho}_{m_1,m_2} = \Pi^n_{m_2}\,\rho_{m_1,m_2}\,\Pi^n_{m_2}, \tag{4.54}$$

which is close to the original state, but has the $\bar{\rho}_{m_2}$ non-typical parts of it trimmed
off.

The third idea is to use equation (B.29) in order to obtain the bound

$$\Pi^n_{m_1,m_2} \leq 2^{n[H(B|X_1X_2)+\delta]}\; \rho_{m_1,m_2}.$$

We will call this the projector trick [GLM12, Sen12a, FHS+12].

Because of the ad hoc nature of the proof of the two-sender simultaneous decoder,
the ideas from the two-sender case cannot be applied to show that simultaneous
decoding of three or more messages is possible. Nevertheless, the techniques used in the proof are
sufficiently general for the analysis of many problems of quantum network information
theory: quantum interference channels (Chapter 5), quantum broadcast channels
(Chapter 6), and quantum relay channels (Chapter [?]).

Interference channels



In an ideal world, when a sender and a receiver wish to communicate, the only obstacle 
they face is the presence of the background noise. Real-world communication scenarios, 
however, often involve multiple senders and multiple receivers sending information at 
the same time and in a shared communication medium. The receivers have to contend 
not only with the background noise but also with the interference caused by the other 
transmissions. The interference channel (IC) is a model for the effects of this crosstalk, 
which occurs whenever a communication channel is shared. 



5.1 Introduction 



Interference is a big problem for all modern multiuser communication systems. In order 
to avoid interference, techniques such as frequency division multiple access (FDMA) 
and time division multiple access (TDMA) can be used to ensure that the senders 
never transmit at the same time and in the same frequency band. Another approach is 
to use code division multiple access (CDMA) and allow users to transmit at the same 
time, but their signal power is randomly spread over large sections of the spectrum so 
as to make it look like white noise. 

Rather than treating the interference as noise, a receiver could instead decode the
interfering signal and then "subtract" it from the received signal in order to reduce
(or even remove) the interference. We call this approach interference cancellation, and
such strategies are the main theme of this chapter.

Note that the interference channel problem differs from the multiple access channel
problem since in this case the multiple access communication is not intended. A receiver
in the interference channel problem is not required to decode the interfering messages,
but he will be able to achieve better communication rates if he does so. All the decoding
strategies discussed in this chapter use some form of interference cancellation as part
of the decoding strategy.



5.1.1 Applications 

The interference channel is an excellent model for many practical communication sce- 
narios where medium contention is an issue. 

Example 5.1 (Next-generation WiFi routers). Consider two neighbours who want to 
connect to their respective WiFi routers. Suppose that the communication happens in 
the same frequency band (radio channel). Suppose further that the neighbours' laptops 
are located such that they are close to their neighbour's WiFi router and far from their 
own. In such a situation, the interference signal will be stronger than their own signal. 
Because the interference signal is "masking" the intended signal, it would be possible 
for the neighbours to decode it, and then cancel its effects. Thus, we see that it can be 
to a neighbour's advantage to decode wireless packets which are not intended for him. 
Decoding messages not intended for us can increase the communication rate from the 
intended sender. Note that to implement such a strategy in practice would require a 
re-engineering of the physical layer of transmission protocols. 

Interference also plays an important role in digital subscriber line (DSL) internet
connections. The twisted pair copper wires of the telephone system were not originally
designed to carry high frequency and high bandwidth signals, and so there is a
significant amount of crosstalk on the wires en route to the phone company premises.
Cross-channel interference is in fact the current limiting factor which imposes speed
limits on the order of 30 Mb/s. The next generation VDSL technology includes the
G.vector standard, which is essentially an interference cancellation scheme for a
vector additive white Gaussian channel [GC02, OSC+10]. The use of the new G.vector
VDSL standard for interference mitigation will allow speeds of up to 100 Mb/s to the
home.

Interestingly, Shannon's first paper on multiuser communication channels was on
"Two-way communication channels", which can model the simultaneous transmission
of information in both directions over a phone line [Sha61]. Shannon anticipated the
importance of NEXT (near-end crosstalk) and FEXT (far-end crosstalk) to communication
systems fifty years in advance. Clearly, he was a man ahead of his times!



5.1.2 Review of classical results 

The seminal papers by Carleial [Car78] and Sato [Sat77] defined the interference channel
problem in its present form and established many of the fundamental results. Finding
the capacity region of the general discrete memoryless interference channel (DMIC)
is still an open problem, but there are certain special cases where the capacity can be
calculated. For channels with "strong" [Sat81] and "very strong" [Car75] interference,
the full capacity region can be calculated. The capacity-achieving decoding strategies
for both of the above special cases require the receivers to completely decode the
interfering messages.

For an arbitrary interference channel, it may only be possible to partially decode
the interfering signal. The Han-Kobayashi rate region $\mathcal{R}_{HK}$, which is achieved by using
partial interference cancellation, is the best known achievable rate region for the general
discrete memoryless interference channel [HK81]. Recently, Chong, Motani and Garg
used a different encoding scheme to obtain an achievable rate region, $\mathcal{R}_{CMG}$, which
contains the Han-Kobayashi rate region [CMG06]. Soon afterwards Kramer proposed
a compact description of the Han-Kobayashi rate region, $\mathcal{R}^{c}_{HK}$, which involved fewer
constraints [Kra06]. Han and Kobayashi published a comment regarding the
Fourier-Motzkin elimination procedure used to derive the bounds [HK07], but the question
remained whether the above rate regions are all equivalent or whether one is strictly
larger than the others. The matter was finally settled by Chong, Motani, Garg and
Hesham El Gamal, who showed that all three rate regions are in fact equivalent:

$$\mathcal{R}_{HK} = \mathcal{R}_{CMG} = \mathcal{R}^{c}_{HK}, \tag{5.1}$$

when the union is taken over all possible input distributions [CMGEG08].

There has been comparatively less work on proving outer bounds on the capacity
region for general discrete memoryless interference channels [Sat77, Car83].

5.1.3 Quantum interference channels

In this chapter, we apply and extend insights from classical information theory to the
study of the quantum interference channel (QIC):

$$\big(\mathcal{X}_1 \times \mathcal{X}_2,\; \rho^{B_1B_2}_{x_1,x_2},\; \mathcal{H}^{B_1} \otimes \mathcal{H}^{B_2}\big), \tag{5.2}$$

which is a model for a general communication network with two classical inputs and
a quantum state $\rho^{B_1B_2}_{x_1,x_2}$ as output. The classical-quantum interference channel can
model physical systems such as fibre-optic cables and free space optical communication
channels [GSW11].

We fully specify a cc-qq interference channel by the set
of output states it produces $\{\rho^{B_1B_2}_{x_1,x_2}\}_{x_1 \in \mathcal{X}_1, x_2 \in \mathcal{X}_2}$ for each possible
combination of inputs. Since Receiver 1 does not have
access to the $B_2$ part of the state $\rho^{B_1B_2}_{x_1,x_2}$, we model his state
as $\rho^{B_1}_{x_1,x_2} = \mathrm{Tr}_{B_2}\big[\rho^{B_1B_2}_{x_1,x_2}\big]$, where $\mathrm{Tr}_{B_2}$ denotes the partial trace
over Receiver 2's system. Similarly, the output state for Receiver 2
is given by $\rho^{B_2}_{x_1,x_2} = \mathrm{Tr}_{B_1}\big[\rho^{B_1B_2}_{x_1,x_2}\big]$.

Figure 5.1: The quantum interference channel $\rho^{B_1B_2}_{x_1,x_2}$, with senders Tx1 and Tx2 and receivers Rx1 and Rx2.

A classical interference channel with transition probability
function $p(y_1, y_2|x_1, x_2)$ is a special case of the cc-qq channel
where the output states are of the form $\rho^{B_1B_2}_{x_1,x_2} = \sum_{y_1,y_2} p(y_1,y_2|x_1,x_2)\,|y_1\rangle\langle y_1|^{B_1} \otimes
|y_2\rangle\langle y_2|^{B_2}$, where $\{|y_1\rangle\}$ and $\{|y_2\rangle\}$ are orthonormal bases of $\mathcal{H}^{B_1}$ and $\mathcal{H}^{B_2}$.
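The classical special case and the partial trace that defines each receiver's state can be illustrated concretely. The sketch below, in plain Python with matrices as nested lists, embeds a classical output distribution $p(y_1,y_2|x_1,x_2)$ for one fixed input pair as a diagonal density matrix and then traces out Receiver 2's system. The function names and the example distribution are hypothetical.

```python
def classical_output_state(p_y1y2, dim1, dim2):
    """Diagonal density matrix sum_{y1,y2} p(y1,y2) |y1><y1| (x) |y2><y2|."""
    size = dim1 * dim2
    rho = [[0.0] * size for _ in range(size)]
    for (y1, y2), prob in p_y1y2.items():
        idx = y1 * dim2 + y2  # index of |y1>|y2> in the product basis
        rho[idx][idx] = prob
    return rho


def partial_trace_b2(rho, dim1, dim2):
    """Trace out the second subsystem of a (dim1*dim2) x (dim1*dim2) matrix."""
    reduced = [[0.0] * dim1 for _ in range(dim1)]
    for i in range(dim1):
        for j in range(dim1):
            reduced[i][j] = sum(rho[i * dim2 + k][j * dim2 + k] for k in range(dim2))
    return reduced


if __name__ == "__main__":
    # Hypothetical joint output distribution for one fixed input pair (x1, x2).
    p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
    rho_b1b2 = classical_output_state(p, 2, 2)
    print(partial_trace_b2(rho_b1b2, 2, 2))
```

In this classical case the partial trace reduces to marginalizing the distribution over $y_2$, so Receiver 1's state is the diagonal matrix of the marginal $p(y_1|x_1,x_2)$.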



5.1.4 Information processing task 

The task of communication over an interference channel can be described as follows.
Using n independent uses of the channel, the objective is for Sender 1 to communicate
with Receiver 1 at a rate $R_1$ and for Sender 2 to communicate with Receiver 2 at a
rate $R_2$.

If there exists an $(n, R_1, R_2, \epsilon)$-code for the classical-quantum interference channel,
then the following conversion is possible:

$$n \cdot \mathcal{N}^{X_1X_2 \to B_1B_2} \;\xrightarrow{(1-\epsilon)}\; nR_1 \cdot [c^1 \to c^1] + nR_2 \cdot [c^2 \to c^2].$$

Figure 5.2: Diagram showing the parts of a classical-quantum interference channel code for
n copies of the channel. Sender 1 selects a message $m_1$ to transmit (modeled by a random
variable $M_1$), and Sender 2 selects a message $m_2$ to transmit (modeled by $M_2$). Each sender
encodes their message as an n-symbol codeword suitable for transmission over the channel.
The receivers each perform a quantum measurement in order to decode the messages that
their partner sender transmitted.

Note that we are only interested in the communication rates from the sender to the
intended receiver, and we ignore the communication capacity of the crosslinks: $[c^1 \to c^2]$
and $[c^2 \to c^1]$.

More specifically. Sender 1 chooses a message mi from a message set A^i = 
{l,2,...,|A^i|} where \M.i\ = 2""^\ and Sender 2 similarly chooses a message 1712 
from a message set A^2 = {1,2, ... , |A^2|} where |A^2| = 2"^2_ Senders 1 and 2 en- 
code their messages as codewords Xi{mi) G and ^2(^2) ^ '^2 respectively, which 
are then input to the channel. The output of the channel is an n-fold tensor product 
state of the form: 

Af^-{xUmi),x-{m2)) ^ pff(S),.^(™.) e Vin^"^^). (5.3) 

To decode the message mi intended for him. Receiver 1 performs a positive operator- 
valued measure (POVM) {Am^j^^^i^ |_^^|| on the system 5", the output of which 
we denote M[. For all mi, A^i is a positive semidefinite operator and A^^ = /. 
Receiver 2 similarly performs a POVM {rmjlmaeji \M2\} '^^ system B2, and the 
random variable associated with this outcome is denoted M2. 

An error occurs whenever Receiver 1's measurement outcome is different from the
message sent by Sender 1 ($M_1' \neq m_1$) or Receiver 2's measurement outcome is different
from the message sent by Sender 2 ($M_2' \neq m_2$). The overall probability of error for
message pair $(m_1, m_2)$ is

$$ p_e(m_1, m_2) \equiv \Pr\{(M_1', M_2') \neq (m_1, m_2)\} = \mathrm{Tr}\Big\{ \big(I - \Lambda_{m_1} \otimes \Gamma_{m_2}\big)\, \rho_{x_1^n(m_1), x_2^n(m_2)} \Big\}, $$

where the measurement operator $(I - \Lambda_{m_1} \otimes \Gamma_{m_2})$ represents the complement
of the correct decoding outcome.

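Definition 5.1. An $(n, R_1, R_2, \epsilon)$ code for the interference channel consists of two
codebooks $\{x_1^n(m_1)\}_{m_1 \in \mathcal{M}_1}$ and $\{x_2^n(m_2)\}_{m_2 \in \mathcal{M}_2}$, and two
decoding POVMs $\{\Lambda_{m_1}\}_{m_1 \in \mathcal{M}_1}$ and $\{\Gamma_{m_2}\}_{m_2 \in \mathcal{M}_2}$,
such that the average probability of error $\bar{p}_e$ is bounded from above by $\epsilon$:

$$ \bar{p}_e \equiv \frac{1}{|\mathcal{M}_1||\mathcal{M}_2|} \sum_{m_1, m_2} p_e(m_1, m_2) \leq \epsilon. \tag{5.4} $$

A rate pair $(R_1, R_2)$ is achievable if there exists an $(n, R_1 - \delta, R_2 - \delta, \epsilon)$
quantum interference channel code for all $\epsilon, \delta > 0$ and sufficiently large $n$. The
channel's capacity region is the closure of the set of all achievable rates.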



Interference channel as two disinterested MAC sub-channels 

The quantum interference channel described by
$(\mathcal{X}_1 \times \mathcal{X}_2, \rho^{B_1B_2}_{x_1,x_2}, \mathcal{H}^{B_1} \otimes \mathcal{H}^{B_2})$
induces two quantum multiple access (QMAC) sub-channels. More specifically, QMAC1 is the
channel to Receiver 1 given by
$(\mathcal{X}_1 \times \mathcal{X}_2, \rho^{B_1}_{x_1,x_2} = \mathrm{Tr}_{B_2}[\rho^{B_1B_2}_{x_1,x_2}], \mathcal{H}^{B_1})$,
and QMAC2 is the channel to Receiver 2 defined by
$(\mathcal{X}_1 \times \mathcal{X}_2, \rho^{B_2}_{x_1,x_2}, \mathcal{H}^{B_2})$. Thus, one possible coding
strategy for the interference channel is to build a codebook for each multiple access
channel that is decodable for both receivers. For this reason, the coding theorems which
we developed for quantum multiple access channels in Chapter 4 will play an important
role in this chapter.

Note however that the IC problem specification does not require that Receiver 1
be able to decode $m_2$ correctly, nor does it specify that Receiver 2 needs to be able
to decode the message sent by Sender 1 correctly, though most interesting coding
strategies involve at least partial decoding of the crosstalk messages. If we take the
logical AND of the two MAC subtasks, i.e., we require both receivers to be able to
decode the messages from both senders, then this communication task is known as the
compound multiple access channel problem [Ahl74b].



5.1.5 Chapter overview 



In this chapter, we use the theorems from Chapter 4 for quantum multiple access
channels to prove coding theorems for quantum interference channels.

In Section 5.2, we prove capacity theorems for two special cases of the interference
channel. In Theorem 5.1 we calculate the capacity region of the quantum interference
channel with "very strong" interference (see Definition 5.2) using the successive decoding
strategy from Theorem 4.1. In Theorem 5.2, we prove the capacity of the channels
with "strong" interference (see Definition 5.3) using the simultaneous decoding strategy
derived in Theorem 4.2.

In Section 5.3 we discuss the quantum Han-Kobayashi coding strategy, where the
messages of the senders are split into two parts so that the receivers can perform
partial interference cancellation [HK81]. The quantum Han-Kobayashi coding strategy
(Theorem 5.3) requires the use of quantum simultaneous decoding for multiple access
channels with three senders, which we described in Conjecture 4.1.

The main contribution of this chapter is to show that the rates of the Han-Kobayashi
rate region can be achieved without the need for Conjecture 4.1. We
will show this in Section 5.4, where we present an achievability proof for the quantum
Chong-Motani-Garg rate region which only uses the two-message simultaneous
decoding technique from Theorem 4.2. Recall that the Chong-Motani-Garg region is
equivalent to the Han-Kobayashi region.

Note that the achievability of the quantum Chong-Motani-Garg rate region was
first proved by Sen in [Sen12a] using a different error analysis technique based on an
intersection projector and a careful analysis of the geometric properties of the CMG
rate region. The alternate proof given in Section 5.4 uses the simultaneous decoding
techniques developed in Section 4.3 and an interesting geometric argument by Eren
Şaşoğlu [Sas08].

The arguments in Section 5.4 show that we can reduce the decoding requirements
from three-message simultaneous decoding to two-message simultaneous decoding and
still achieve all the rates in the Han-Kobayashi rate region. It might even be
possible to remove the need for a simultaneous decoder altogether. Can the Han-Kobayashi
rate region be achieved using only successive decoding? In Section 5.6, we
discuss the difference between interference channel codes (both classical and quantum)
based on successive decoding and those based on simultaneous decoding. In particular,
we show that rate-splitting strategies based on successive decoding are not a good
choice for interference channel codes, contrary to what has been claimed elsewhere.

Finally, we obtain Theorem 5.8, which is a quantum analogue of Sato's outer bound
for the interference channel.



5.2 Capacity results for special cases 

In this section, we consider decoding strategies where the receivers decode the messages 
from both senders. We show that this decoding strategy is optimal for the special cases 
of the interference channel with "very strong" and "strong" interference. 



5.2.1 Very strong interference case 

If we use a successive decoding strategy at both receivers, and calculate the best possible
rates that are compatible with both receivers' ability to decode, we obtain an achievable
rate region. Consider the decoding strategy where Receiver 1 decodes in the
order $m_2 \to m_1|m_2$ and Receiver 2 decodes in the order $m_1 \to m_2|m_1$. In this case,
we know that the messages are decodable for Receiver 1 provided $R_1 \leq I(X_1;B_1|X_2)$
and $R_2 \leq I(X_2;B_1)$. Receiver 2 will be able to decode provided $R_1 \leq I(X_1;B_2)$
and $R_2 \leq I(X_2;B_2|X_1)$. Thus, the rate pair
$R_1 \leq \min\{I(X_1;B_1|X_2), I(X_1;B_2)\}$,
$R_2 \leq \min\{I(X_2;B_1), I(X_2;B_2|X_1)\}$ is achievable for the interference channel.

On the other hand, the rate $R_1 \leq I(X_1;B_1|X_2)$ is the optimal rate Receiver 1
could possibly achieve, since this rate corresponds to the message $m_1$ being decoded
second [Win01]. Similarly, the rate $R_2 \leq I(X_2;B_2|X_1)$ is an upper bound on the rates
achievable between Sender 2 and Receiver 2.

We now define a special class of interference channels, where the achievable rate
region obtained using the above successive decoding strategy matches the outer bound.

Definition 5.2 (Very strong interference). An interference channel with very strong
interference [Car75] is such that for all input distributions $p_{X_1}$ and $p_{X_2}$,

$$ I(X_1;B_1|X_2) \leq I(X_1;B_2), \tag{5.5} $$
$$ I(X_2;B_2|X_1) \leq I(X_2;B_1). \tag{5.6} $$

The information inequalities in (5.5)-(5.6) imply that the interference is so strong
that it is possible for each receiver to decode the other sender's message before decoding
the message intended for him. These conditions are a generalization of Carleial's
conditions for a classical Gaussian interference channel [Car75, EGK10].

Thus, we can calculate the exact capacity region for the special case of the
classical-quantum interference channel with very strong interference.



Figure 5.3: The capacity region for a cc-qq quantum interference channel which satisfies
the "very strong" interference conditions (5.5) and (5.6). The figure also shows the capacity
regions for the multiple access channel problems associated with each receiver: QMAC1 and
QMAC2. The capacity region for the IC corresponds to their intersection.

Theorem 5.1 (Channels with very strong interference). The channel's capacity region
is given by:

$$ \bigcup_{p_Q(q)\, p_{X_1|Q}(x_1|q)\, p_{X_2|Q}(x_2|q)} \left\{ (R_1, R_2) \in \mathbb{R}_+^2 \ \middle|\ \begin{array}{l} R_1 \leq I(X_1;B_1|X_2Q), \\ R_2 \leq I(X_2;B_2|X_1Q) \end{array} \right\}, \tag{5.7} $$

where the mutual information quantities are calculated with respect to a state
$\theta^{QX_1X_2B_1B_2}$ of the form:

$$ \sum_{x_1,x_2,q} p_Q(q)\, p_{X_1|Q}(x_1|q)\, p_{X_2|Q}(x_2|q)\ |q\rangle\langle q|^{Q} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes |x_2\rangle\langle x_2|^{X_2} \otimes \rho^{B_1B_2}_{x_1,x_2}. \tag{5.8} $$

This result has a seemingly counterintuitive interpretation: for channels with very
strong interference, the capacity is the same as if there were no interference [Car75].

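Proof. We require the receivers to decode the messages for both senders. The average
probability of error for the interference channel code is given by:

$$ \bar{p}_e \equiv \frac{1}{|\mathcal{M}_1||\mathcal{M}_2|} \sum_{m_1,m_2} p_e(m_1,m_2) \ \overset{①}{=}\ p_e(m_1,m_2) = \mathrm{Tr}\Big[ \big(I - \Lambda^{B_1^n}_{m_1,m_2} \otimes \Gamma^{B_2^n}_{m_1,m_2}\big)\, \rho^{B_1^nB_2^n}_{x_1^n(m_1)\, x_2^n(m_2)} \Big], \tag{5.9} $$

where equality ① comes from the symmetry of the codebook construction: it is sufficient
to perform the error analysis for a fixed message pair $(m_1, m_2)$.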

Next, we use the following lemma, which is a kind of operator union bound [ADHW09].

Lemma 5.1. For any operators $0 \leq P^A, Q^B \leq I$, we have:

$$ I^{AB} - P^A \otimes Q^B \ \leq\ (I^A - P^A) \otimes I^B + I^A \otimes (I^B - Q^B). \tag{5.10} $$

Proof of Lemma 5.1. Starting from $P^A \leq I$ and $Q^B \leq I$, we obtain $0 \leq (I - P^A)$ and
$0 \leq (I - Q^B)$, which can be combined to obtain:

$$ 0 \ \leq\ (I - P^A) \otimes (I - Q^B) = I^{AB} - P^A \otimes I^B - I^A \otimes Q^B + P^A \otimes Q^B. $$

The inequality (5.10) follows by moving the term $P^A \otimes Q^B$ to the left hand side and
adding a term $I^{AB}$ to both sides. □
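
As a sanity check on the operator union bound of Lemma 5.1, it can help to look at the special case where $P^A$ and $Q^B$ are diagonal in fixed bases. This restriction is an assumption made here purely for illustration; the lemma itself does not need it. Writing $p$ and $q$ for a pair of eigenvalues of $P^A$ and $Q^B$ (these symbols are introduced only for this sketch), the operator inequality reduces to a scalar one:

```latex
\begin{align*}
  1 - pq \;\leq\; (1 - p) + (1 - q)
  \quad\Longleftrightarrow\quad
  0 \;\leq\; (1 - p)(1 - q),
  \qquad 0 \leq p, q \leq 1 .
\end{align*}
```

If $p$ and $q$ are read as the success probabilities of two independent decoders, the left side is the probability that at least one of them fails and the right side is the sum of the individual failure probabilities, which is the familiar classical union bound.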



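When applied to the current problem, the inequality (5.10) gives:

$$ \big(I - \Lambda^{B_1^n}_{m_1,m_2} \otimes \Gamma^{B_2^n}_{m_1,m_2}\big) \ \leq\ \big(I - \Lambda^{B_1^n}_{m_1,m_2}\big) \otimes I^{B_2^n} + I^{B_1^n} \otimes \big(I - \Gamma^{B_2^n}_{m_1,m_2}\big), $$

which in turn allows us to split expression (5.9) into two terms:

$$ \bar{p}_e \ \leq\ \mathrm{Tr}_{B_1^nB_2^n}\Big[ \big( (I - \Lambda^{B_1^n}_{m_1,m_2}) \otimes I^{B_2^n} \big)\, \rho^{B_1^nB_2^n}_{x_1^n(m_1)x_2^n(m_2)} \Big] + \mathrm{Tr}_{B_1^nB_2^n}\Big[ \big( I^{B_1^n} \otimes (I - \Gamma^{B_2^n}_{m_1,m_2}) \big)\, \rho^{B_1^nB_2^n}_{x_1^n(m_1)x_2^n(m_2)} \Big] $$
$$ \phantom{\bar{p}_e} \ =\ \mathrm{Tr}_{B_1^n}\Big[ \big(I - \Lambda^{B_1^n}_{m_1,m_2}\big)\, \rho^{B_1^n}_{x_1^n(m_1)x_2^n(m_2)} \Big] + \mathrm{Tr}_{B_2^n}\Big[ \big(I - \Gamma^{B_2^n}_{m_1,m_2}\big)\, \rho^{B_2^n}_{x_1^n(m_1)x_2^n(m_2)} \Big]. $$

Each of the above error terms is associated with the probability of error for one of
the receivers. The decoding problem for each receiver corresponds to a multiple access
channel (MAC) problem. We can use the successive decoding techniques from Theorem 4.1
to show that the decoding at the rates $R_1 \leq I(X_1;B_1|X_2)$, $R_2 \leq I(X_2;B_2|X_1)$
will succeed.

Receiver 1 will decode in the order $m_2 \to m_1|m_2$. During the first decoding step,
Receiver 1 decodes the interfering message $m_2$, and we know that this is possible because
the rate satisfies $R_2 \leq I(X_2;B_2|X_1) \leq I(X_2;B_1)$, where the second inequality is the
very strong interference condition (5.6). In the second step, Receiver 1
decodes the message from Sender 1 given full knowledge of the transmission of
Sender 2, which is possible at any rate $R_1 \leq I(X_1;B_1|X_2)$. Receiver 2 decodes in the
order $m_1 \to m_2|m_1$, which is possible because of condition (5.5), in order to use full
interference cancellation and achieve the rate $R_2 \leq I(X_2;B_2|X_1)$.

The outer bound follows from the converse part of Theorem 4.1, since the individual
rates are optimal in the two MAC sub-channels [Car75]. □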



Example 5.2. We now consider an example of a cc-qq quantum interference channel
with two classical inputs and two quantum outputs and calculate its capacity region
using Theorem 5.1 [FHS+12]. The "θ-SWAP" channel $\mathcal{N}$, with inputs in $\{0,1\}^2$, is described
by:

$$ 00 \to |00\rangle^{B_1B_2}, \tag{5.11} $$
$$ 01 \to \cos(\theta)\,|01\rangle^{B_1B_2} + \sin(\theta)\,|10\rangle^{B_1B_2}, \tag{5.12} $$
$$ 10 \to -\sin(\theta)\,|01\rangle^{B_1B_2} + \cos(\theta)\,|10\rangle^{B_1B_2}, \tag{5.13} $$
$$ 11 \to |11\rangle^{B_1B_2}. \tag{5.14} $$

Figure 5.4: The capacity region of the "θ-SWAP" interference channel for various values
of θ such that the channel exhibits "very strong" interference. The capacity region is largest
when θ gets closer to 2.18, and it vanishes when θ = π/2 because the channel becomes a full
SWAP (at this point, Receiver $i$ gets no information from Sender $i$, where $i \in \{1,2\}$).


We would like to determine an interval for the parameter θ for which the channel
exhibits "very strong" interference. In order to do so, we need to consider
classical-quantum states of the following form:

$$ \rho^{X_1X_2B_1B_2} \equiv \sum_{x_1,x_2=0}^{1} p_{X_1}(x_1)\, p_{X_2}(x_2)\ |x_1\rangle\langle x_1|^{X_1} \otimes |x_2\rangle\langle x_2|^{X_2} \otimes \psi^{B_1B_2}_{x_1,x_2}, \tag{5.15} $$

where $\psi^{B_1B_2}_{x_1,x_2}$ is one of the pure output states in (5.11)-(5.14). We should then check
whether the conditions in (5.5)-(5.6) hold for all distributions $p_{X_1}(x_1)$ and $p_{X_2}(x_2)$.
We can equivalently express these conditions in terms of von Neumann entropies as
follows:

$$ H(B_1|X_2)_\rho - H(B_1|X_1X_2)_\rho \ \leq\ H(B_2)_\rho - H(B_2|X_1)_\rho, $$
$$ H(B_2|X_1)_\rho - H(B_2|X_1X_2)_\rho \ \leq\ H(B_1)_\rho - H(B_1|X_2)_\rho, $$

and thus, it suffices to calculate six entropies for states of the form in (5.15). After
some straightforward calculations, we find that:

$$ H(B_1|X_1X_2)_\rho = H(B_2|X_1X_2)_\rho = \big(p_{X_1}(0)p_{X_2}(1) + p_{X_1}(1)p_{X_2}(0)\big)\, H_2(\cos^2(\theta)), $$
$$ H(B_1)_\rho = H_2\big(p_{X_1}(1) + (p_{X_1}(0)p_{X_2}(1) - p_{X_1}(1)p_{X_2}(0)) \sin^2(\theta)\big), $$
$$ H(B_2)_\rho = H_2\big(p_{X_2}(1) + (p_{X_2}(0)p_{X_1}(1) - p_{X_2}(1)p_{X_1}(0)) \sin^2(\theta)\big), $$
$$ H(B_2|X_1)_\rho = p_{X_1}(0)\, H_2\big(p_{X_2}(1)\cos^2(\theta)\big) + p_{X_1}(1)\, H_2\big(p_{X_2}(0)\cos^2(\theta)\big), $$
$$ H(B_1|X_2)_\rho = p_{X_2}(0)\, H_2\big(p_{X_1}(1)\cos^2(\theta)\big) + p_{X_2}(1)\, H_2\big(p_{X_1}(0)\cos^2(\theta)\big), $$

where $H_2(p)$ is the binary entropy function. We numerically checked for particular
values of θ whether the conditions (5.5)-(5.6) hold for all distributions $p_{X_1}(x_1)$ and
$p_{X_2}(x_2)$, and we found that they hold when $\theta \in [0.96, 2.18] \cup [4.10, 5.32]$ (the latter
interval in the union is approximately a shift of the first interval by π). The interval
[0.96, 2.18] contains θ = π/2, the value of θ for which the capacity should vanish because
the transformation is equivalent to a full SWAP (the channel at this point has "too
strong" interference). We compute the capacity region given in Theorem 5.1 for several
values of θ in the interval [π/2, 2.18] (it is redundant to evaluate for other intervals
because the capacity region is symmetric about π/2 and it is also equivalent for the two
π-shifted intervals [0.96, 2.18] and [4.10, 5.32]). Figure 5.4 plots these capacity regions
for several values of θ in the interval [π/2, 2.18].
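
The numerical check described in this example can be sketched in a few lines of Python. This is a minimal illustration rather than the code used to produce the figure: it evaluates the closed-form entropies listed above, tests conditions (5.5)-(5.6) only on a finite grid of product input distributions, and ignores time-sharing. The function names, the grid size and the tolerance are all assumptions made for this sketch.

```python
import math

def h2(p):
    """Binary entropy in bits; the argument is clamped to [0, 1]."""
    p = min(max(p, 0.0), 1.0)
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def entropies(theta, p1, p2):
    """Closed-form entropies of the theta-SWAP channel.

    p1 = Pr[X1 = 1] and p2 = Pr[X2 = 1] (hypothetical parameter names).
    Returns H(B1|X1X2) (= H(B2|X1X2)), H(B1), H(B2), H(B2|X1), H(B1|X2).
    """
    c, s = math.cos(theta) ** 2, math.sin(theta) ** 2
    a0, a1 = 1 - p1, p1          # distribution of X1
    b0, b1 = 1 - p2, p2          # distribution of X2
    h_b_x1x2 = (a0 * b1 + a1 * b0) * h2(c)
    h_b1 = h2(a1 + (a0 * b1 - a1 * b0) * s)
    h_b2 = h2(b1 + (b0 * a1 - b1 * a0) * s)
    h_b2_x1 = a0 * h2(b1 * c) + a1 * h2(b0 * c)
    h_b1_x2 = b0 * h2(a1 * c) + b1 * h2(a0 * c)
    return h_b_x1x2, h_b1, h_b2, h_b2_x1, h_b1_x2

def very_strong(theta, grid=20, tol=1e-9):
    """Check (5.5)-(5.6) on a grid of product input distributions."""
    for i in range(grid + 1):
        for j in range(grid + 1):
            h12, hb1, hb2, hb2x1, hb1x2 = entropies(theta, i / grid, j / grid)
            if hb1x2 - h12 > hb2 - hb2x1 + tol:   # I(X1;B1|X2) <= I(X1;B2)
                return False
            if hb2x1 - h12 > hb1 - hb1x2 + tol:   # I(X2;B2|X1) <= I(X2;B1)
                return False
    return True

def rate_corner(theta, p1, p2):
    """Corner (I(X1;B1|X2), I(X2;B2|X1)) of the rectangle in Theorem 5.1."""
    h12, _, _, hb2x1, hb1x2 = entropies(theta, p1, p2)
    return hb1x2 - h12, hb2x1 - h12

if __name__ == "__main__":
    for theta in (0.5, math.pi / 2, 2.0):
        print(theta, very_strong(theta), rate_corner(theta, 0.5, 0.5))
```

A grid search can only refute the conditions, not prove them for all distributions, so a `True` result here should be read as "no violation found on the grid".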



5.2.2 Strong interference case 



The simultaneous decoder from Theorem 4.2 allows us to calculate the capacity region
for the following special case of the quantum interference channel.

Definition 5.3 (Strong interference). A quantum interference channel with strong
interference [Sat81, CEG87] is one for which the following conditions hold:

$$ I(X_1;B_1|X_2) \leq I(X_1;B_2|X_2), \tag{5.16} $$
$$ I(X_2;B_2|X_1) \leq I(X_2;B_1|X_1), \tag{5.17} $$

for all input distributions $p_{X_1}$ and $p_{X_2}$.

Figure 5.5: The capacity region for a cc-qq quantum interference channel which satisfies
the "strong" interference conditions (5.16) and (5.17). The figure also shows the capacity
regions for the multiple access channel problems associated with each receiver: QMAC1 and
QMAC2. The capacity region corresponds to the intersection.

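Theorem 5.2 (Channels with strong interference). The channel's capacity region is:

$$ \bigcup_{p_Q(q)\, p_{X_1|Q}(x_1|q)\, p_{X_2|Q}(x_2|q)} \left\{ (R_1, R_2) \in \mathbb{R}_+^2 \ \middle|\ \begin{array}{rcl} R_1 & \leq & I(X_1;B_1|X_2Q), \\ R_2 & \leq & I(X_2;B_2|X_1Q), \\ R_1 + R_2 & \leq & \min\{ I(X_1X_2;B_1|Q),\ I(X_1X_2;B_2|Q) \} \end{array} \right\}, \tag{5.18} $$

where the mutual information quantities are calculated with respect to a state
$\theta^{QX_1X_2B_1B_2}$ of the form:

$$ \sum_{x_1,x_2,q} p_Q(q)\, p_{X_1|Q}(x_1|q)\, p_{X_2|Q}(x_2|q)\ |q\rangle\langle q|^{Q} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes |x_2\rangle\langle x_2|^{X_2} \otimes \rho^{B_1B_2}_{x_1,x_2}. \tag{5.19} $$

The capacity region is the intersection of the MAC rate regions for the two receivers,
which corresponds to the condition that we choose the rates such that each receiver
can decode both $m_1$ and $m_2$. See Figure 5.5.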



Proof. The first part of the proof is analogous to the proof of Theorem 5.1 for the
interference channel with very strong interference. We use Lemma 5.1 to split the error
analysis for the interference channel decoding task into two multiple access channel
decoding tasks, one for each receiver.

The key difference with Theorem 5.1 is that for the strong interference case, we
require the decoders to use the simultaneous decoding approach from Theorem 4.2 and
coded time-sharing codebooks as described in Corollary 4.1. The rate pairs described
by the inequalities in (5.18) are decodable by both receivers. Therefore, these rates are
achievable for the interference channel problem.

The proof of the outer bound for Theorem 5.2 follows from the outer bound in
Theorem 4.1 and an argument similar to the one used in the classical case [CEG87]
(see also [EGK10, page 6-13]). □



5.3 The quantum Han-Kobayashi rate region 

For general interference channels, the Han-Kobayashi coding strategy gives the best
known achievable rate region [HK81] and involves partial decoding of the interfering
signal. Instead of using a standard codebook to encode her message $m_1$, Sender 1
splits her message into two parts: a personal message $m_{1p}$ and a common message
$m_{1c}$. Assuming that Receiver 1 is able to decode both of these messages, the net
rate from Sender 1 to Receiver 1 will be the sum of the rates of the split codebooks:
$R_1 = R_{1p} + R_{1c}$. The benefit of using a split codebook^1 is that Receiver 2 can decode
Sender 1's common message $m_{1c}$ and achieve a better communication rate by using
interference cancellation. Because only part of the interfering message is used, we call
this partial interference cancellation. Sender 2 will also split her message $m_2$ into two
parts: $m_{2p}$ and $m_{2c}$.

Codebook construction: Consider the auxiliary random variables $Q, U_1, W_1, U_2, W_2$
and the class of Han-Kobayashi probability distributions, $\mathcal{P}_{HK}$, which factorize as

$$ p_{HK}(q, u_1, w_1, x_1, u_2, w_2, x_2) = p(q)\, p(u_1|q)\, p(w_1|q)\, p(x_1|u_1, w_1)\, p(u_2|q)\, p(w_2|q)\, p(x_2|u_2, w_2), $$

where $p(x_1|u_1, w_1)$ and $p(x_2|u_2, w_2)$ are degenerate probability distributions that
correspond to deterministic functions $f_1$ and $f_2$, $f_i : \mathcal{U}_i \times \mathcal{W}_i \to \mathcal{X}_i$,
which are used to combine the values of $U$ and $W$ to produce a symbol $X$ suitable as input
to the channel.

^1 Note that the Han-Kobayashi strategy is also referred to as rate-splitting in the literature.
In this document we reserve the term rate-splitting for the use of a split codebook and successive
decoding as in [GRUW01] and [Rim01].

We generate the random codebooks in the following manner:

• Randomly and independently generate a sequence $q^n$ according to $\prod_{i=1}^{n} p_Q(q_i)$.

• Randomly and independently generate $2^{nR_{1c}}$ sequences $w_1^n(m_{1c})$, $m_{1c} \in [1 : 2^{nR_{1c}}]$,
conditionally on the sequence $q^n$ according to $\prod_{i=1}^{n} p_{W_1|Q}(w_{1i}|q_i)$.

• Randomly and independently generate $2^{nR_{1p}}$ sequences $u_1^n(m_{1p})$, $m_{1p} \in [1 : 2^{nR_{1p}}]$,
conditionally on the sequence $q^n$ according to $\prod_{i=1}^{n} p_{U_1|Q}(u_{1i}|q_i)$.

• Apply the function $f_1$ symbol-wise to the codewords $w_1^n(m_{1c})$ and $u_1^n(m_{1p})$ to
obtain the codeword $x_1^n(m_{1c}, m_{1p})$.

• We generate the common and personal codebooks for Sender 2 in a similar fashion
and combine them using $f_2$ to obtain $x_2^n(m_{2c}, m_{2p})$.
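
The random codebook construction above can be sketched as follows for one sender. This is only an illustration under stated assumptions: the function and parameter names (`hk_codebook`, `p_w_given_q`, and so on) are hypothetical, distributions are passed as plain dictionaries, the number of codewords is passed directly instead of as $2^{nR}$, and the XOR combining function in the usage example is an arbitrary choice of $f_1$.

```python
import random

def sample(dist, rng):
    """Draw one symbol from a dict {symbol: probability}."""
    symbols = list(dist)
    return rng.choices(symbols, weights=[dist[s] for s in symbols])[0]

def hk_codebook(n, num_common, num_personal, p_q, p_w_given_q, p_u_given_q, f, rng):
    """Random split codebook for one sender (Han-Kobayashi style sketch).

    p_w_given_q[q] and p_u_given_q[q] are the conditional distributions of
    the common (W) and personal (U) variables given the symbol q.
    """
    # Time-sharing sequence q^n. In the actual scheme both senders use the
    # same q^n; this sketch draws it inside the function for simplicity.
    q = [sample(p_q, rng) for _ in range(n)]
    # Common codewords w^n(m_c), drawn symbol by symbol conditionally on q^n.
    w = [[sample(p_w_given_q[qi], rng) for qi in q] for _ in range(num_common)]
    # Personal codewords u^n(m_p), also drawn conditionally on q^n.
    u = [[sample(p_u_given_q[qi], rng) for qi in q] for _ in range(num_personal)]
    # Apply f symbol-wise to obtain the channel input x^n(m_c, m_p).
    x = {(mc, mp): [f(ui, wi) for ui, wi in zip(u[mp], w[mc])]
         for mc in range(num_common) for mp in range(num_personal)}
    return q, w, u, x

if __name__ == "__main__":
    rng = random.Random(0)
    p_q = {0: 1.0}                       # trivial time-sharing variable
    p_bit = {0: {0: 0.5, 1: 0.5}}        # uniform bits given q = 0
    q, w, u, x = hk_codebook(8, 2, 4, p_q, p_bit, p_bit, lambda a, b: a ^ b, rng)
    print(x[(0, 0)])
```

Each pair of a common and a personal message index maps to one input codeword, so the codebook has `num_common * num_personal` entries in total.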



Decoding: When the split codebooks are used for the interference channel, we are
effectively coding for an interference network with four inputs and two outputs. We
can think of the decoding performed by each of the receivers as two multiple access
channel (MAC) decoding subproblems. We will denote the achievable rate regions for
the MAC sub-problems as $\mathcal{R}^1_{HK}$ and $\mathcal{R}^2_{HK}$. The task for Receiver 1 is to decode the
messages $(m_{1p}, m_{1c}, m_{2c})$, and thus the sub-task $\mathcal{R}^1_{HK}$ corresponds to a three-sender
multiple access channel, the rate region for which is described by seven inequalities on
the rate triples $(R_{1p}, R_{1c}, R_{2c})$. The decoding task for Receiver 2, $\mathcal{R}^2_{HK}$, is similarly
described by seven inequalities on the rates $(R_{1c}, R_{2c}, R_{2p})$.

We perform Fourier-Motzkin elimination on the inequalities of the MAC rate regions
for the two receivers in order to eliminate the variables $R_{1p}$, $R_{1c}$, $R_{2p}$ and $R_{2c}$,
replacing them with the sum variables

$$ R_1 = R_{1p} + R_{1c}, \qquad R_2 = R_{2p} + R_{2c}. \tag{5.20} $$

At each step in the Fourier-Motzkin elimination process, we use the information theoretic
properties in order to eliminate redundant inequalities. The result is the Han-Kobayashi
rate region.
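
A single step of Fourier-Motzkin elimination can be sketched as below. This is a generic textbook-style illustration, not the full elimination that produces the Han-Kobayashi region: the toy constraints and the numbers `A`, `B`, `C` standing in for mutual information values are hypothetical, and no redundancy removal is attempted. Inequalities are stored as pairs `(a, b)` meaning $\sum_i a_i x_i \leq b$.

```python
from fractions import Fraction

def eliminate(ineqs, k):
    """Project out variable k from inequalities (a, b): sum_i a[i]*x[i] <= b."""
    pos = [(a, b) for a, b in ineqs if a[k] > 0]    # upper bounds on x[k]
    neg = [(a, b) for a, b in ineqs if a[k] < 0]    # lower bounds on x[k]
    out = [(a, b) for a, b in ineqs if a[k] == 0]   # do not involve x[k]
    for ap, bp in pos:
        for an, bn in neg:
            # Scale so that the coefficients of x[k] are +1 and -1, then add.
            sp, sn = Fraction(1) / ap[k], Fraction(1) / -an[k]
            a = tuple(sp * x + sn * y for x, y in zip(ap, an))
            out.append((a, sp * bp + sn * bn))
    return out

def upper_bound(ineqs, j):
    """Tightest upper bound on x[j] once all other variables are eliminated."""
    return min(Fraction(b) / Fraction(a[j]) for a, b in ineqs if a[j] > 0)

# Toy split-rate problem in the variables (R1p, R1), after substituting
# R1c = R1 - R1p. A, B, C are hypothetical stand-ins for information quantities.
A, B, C = 1, 2, 2
toy = [
    ((1, 0), A),     # R1p <= A
    ((-1, 1), C),    # R1c = R1 - R1p <= C
    ((0, 1), B),     # R1p + R1c = R1 <= B
    ((-1, 0), 0),    # R1p >= 0
    ((1, -1), 0),    # R1c >= 0
]

if __name__ == "__main__":
    projected = eliminate(toy, 0)        # remove R1p
    print(upper_bound(projected, 1))     # bound on the net rate R1
```

In this toy case the projection leaves $R_1 \leq \min\{B, A + C\}$ together with $R_1 \geq 0$ and two trivial inequalities, which mirrors on a small scale how sum-rate bounds such as (5.20) arise from pairing constraints on the split rates.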



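Theorem 5.3 (Quantum Han-Kobayashi rate region). Consider the region:

$$ \mathcal{R}_{HK}(\mathcal{N}) \equiv \bigcup_{p_{HK} \in \mathcal{P}_{HK}} \left\{ (R_1, R_2) \in \mathbb{R}_+^2 \ \middle|\ \text{Eqns. (HK1)-(HK9)} \right\} $$

$$ R_1 \leq I(U_1W_1; B_1|W_2Q) \tag{HK1} $$
$$ R_1 \leq I(U_1; B_1|W_1W_2Q) + I(W_1; B_2|U_2W_2Q) \tag{HK2} $$
$$ R_2 \leq I(U_2W_2; B_2|W_1Q) \tag{HK3} $$
$$ R_2 \leq I(U_2; B_2|W_1W_2Q) + I(W_2; B_1|U_1W_1Q) \tag{HK4} $$
$$ R_1 + R_2 \leq I(U_1W_1W_2; B_1|Q) + I(U_2; B_2|W_1W_2Q) \tag{HK5} $$
$$ R_1 + R_2 \leq I(U_1; B_1|W_2W_1Q) + I(U_2W_2W_1; B_2|Q) \tag{HK6} $$
$$ R_1 + R_2 \leq I(U_1W_2; B_1|W_1Q) + I(U_2W_1; B_2|W_2Q) \tag{HK7} $$
$$ 2R_1 + R_2 \leq I(U_1; B_1|W_1W_2Q) + I(U_2W_1; B_2|W_2Q) + I(U_1W_1W_2; B_1|Q) \tag{HK8} $$
$$ R_1 + 2R_2 \leq I(U_1W_2; B_1|W_1Q) + I(U_2; B_2|W_2W_1Q) + I(U_2W_2W_1; B_2|Q) \tag{HK9} $$

where the information theoretic quantities are taken with respect to a state

$$ \sum_{q,u_1,u_2,w_1,w_2} p_Q(q)\, p_{U_1|Q}(u_1|q)\, p_{U_2|Q}(u_2|q)\, p_{W_1|Q}(w_1|q)\, p_{W_2|Q}(w_2|q)\ |q\rangle\langle q|^{Q} \otimes |u_1\rangle\langle u_1|^{U_1} \otimes |u_2\rangle\langle u_2|^{U_2} \otimes |w_1\rangle\langle w_1|^{W_1} \otimes |w_2\rangle\langle w_2|^{W_2} \otimes \rho^{B_1B_2}_{f_1(u_1,w_1),\, f_2(u_2,w_2)}. $$

The region $\mathcal{R}_{HK}(\mathcal{N})$ is an achievable rate region provided Conjecture 4.1 holds.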





Each of the inequalities (HK1)-(HK9) describes some limit imposed on the personal
or common rates of the two senders. For example, (HK1) corresponds to the maximum
rate at which $m_{1p}$ and $m_{1c}$ can be decoded by Receiver 1 given that he has already
decoded $m_{2c}$. Other inequalities correspond to mixed bounds, in which one of the terms
comes from a constraint on Receiver 1 and the other from a constraint on Receiver 2.
An example of this is (HK2), which comes from the bound on Receiver 1's ability to
decode $m_{1p}$ (given $m_{1c}$ and $m_{2c}$) and a bound from Receiver 2's ability to decode $m_{1c}$
(given $m_{2c}$ and $m_{2p}$).^2

Figure 5.6: The random variables used in the Han-Kobayashi coding strategy. Sender 1
selects codewords according to a "personal" random variable $U_1$ and a "common" random
variable $W_1$. She then acts on $U_1$ and $W_1$ with some deterministic function $f_1$ that outputs
a variable $X_1$ which serves as a classical input to the interference channel. Sender 2 uses a
similar encoding. Receiver 1 performs a measurement to decode both variables of Sender 1
and the common random variable $W_2$ of Sender 2. Receiver 2 acts similarly. The advantage
of this coding strategy is that it makes use of interference in the channel by having each
receiver partially decode what the other sender is transmitting. Theorem 5.3 gives the rates
that are achievable assuming that Conjecture 4.1 holds.

Note that the original description of the rate region given by Han and Kobayashi
in [HK81] and later in [HK07] contained two extra inequalities. Chong et al. showed
that these extra inequalities are redundant, and so the best description of $\mathcal{R}_{HK}$ involves
only nine inequalities as above [CMGEG08].

Proof. The proof is in the same spirit as the original result of Han and Kobayashi
[HK81]. The first step is to use Lemma 5.1 to obtain

$$ \big(I - \Lambda^{B_1^n}_{m_{1p},m_{1c},m_{2c}} \otimes \Gamma^{B_2^n}_{m_{1c},m_{2c},m_{2p}}\big) \ \leq\ \big(I - \Lambda^{B_1^n}_{m_{1p},m_{1c},m_{2c}}\big) \otimes I^{B_2^n} + I^{B_1^n} \otimes \big(I - \Gamma^{B_2^n}_{m_{1c},m_{2c},m_{2p}}\big), $$

which allows us to bound the error analysis for the interference channel task in terms
of the error analysis for two MAC sub-channels. Our result is conditional on Conjecture 4.1
for the construction of the decoding POVMs for each MAC sub-channel:
$\{\Lambda_{m_{1p},m_{1c},m_{2c}}\}$ for Receiver 1, and $\{\Gamma_{m_{1c},m_{2c},m_{2p}}\}$ for Receiver 2. □



^2 Receiver 2 is not required to decode the common message of Sender 1, but the Han-Kobayashi
strategy does require this condition despite the fact that there could be no interference cancellation
benefits for doing so, given that Receiver 2 has already decoded the messages $m_{2c}$ and $m_{2p}$. This
should serve as a hint that the Han-Kobayashi decoding requirements can be relaxed. We will discuss
this further in the next section.
At the very least, observe that Theorem 5.3 depends on Conjecture 4.1 for its
proof. While we do not doubt that the conjecture will ultimately turn out to be true,
the fact remains that our result is conditional on an unproven conjecture, which is
somewhat unsatisfactory.

In order to remedy this shortcoming, we searched for other approaches which
could be used to prove that the rates of the quantum Han-Kobayashi rate region are
achievable. First, we proved that the quantum Han-Kobayashi rate region is achievable
for a special class of interference channels where the output states commute. We also
derived an achievable rate region described in terms of min-entropies [Ren05, Tom12],
which is in general smaller than the Han-Kobayashi rate region. These results are
well documented in [FHS+12]. Another approach which we studied is the use of a
rate-splitting and successive decoding approach in order to achieve the rates of the
Han-Kobayashi rate region. We attempted to adapt the results of Şaşoğlu in [Sas08],
which claimed, erroneously, that the rate-splitting strategy can be used in order to
achieve the Chong-Motani-Garg (CMG) rate region. Recall that the Chong-Motani-Garg
rate region is equivalent to the Han-Kobayashi rate region [CMGEG08]. In fact,
as we will see shortly, the Chong-Motani-Garg approach is simply a specific coding
strategy to carry out the Han-Kobayashi partial interference cancellation idea.

The analysis in [Sas08] is in two parts. The first part is a geometric argument,
henceforth referred to as the Şaşoğlu argument, which shows that there is a many-to-one
mapping between the rates of the split codebooks $(R_{1p}, R_{1c}, R_{2c}, R_{2p})$ and the resulting
rates $(R_1, R_2)$ for the interference channel task. In the second part of the analysis,
Şaşoğlu describes a strategy for the use of rate-splitting and successive decoding for
the common message. The common-message codebook for one sender is split so as to
accommodate one of the receivers, assuming the common-message codebook of the other
sender is not split. However, if both users split their common-message codebooks, the
rates cannot be chosen, in general, so as to achieve all the rates of the Chong-Motani-Garg
rate region. We will comment on this further in Section 5.6.

While rate-splitting and successive decoding turned out to be a dead end in our
quest for the quantum Han-Kobayashi region, the Şaşoğlu argument and the use of
two-sender simultaneous decoding turn out to be sufficient in order to show the
achievability of the quantum Chong-Motani-Garg rate region. This will be the subject of
Section 5.5 below.

5.4 The quantum Chong-Motani-Garg rate region

The achievability of the quantum Chong-Motani-Garg (CMG) rate region was recently
proved by Sen using novel geometric ideas for the "intersection subspace" of projectors
and a "sequential decoding" technique [Sen12a]. In this section we will describe the
CMG coding strategy and state Sen's result in Theorem 5.4. In Section 5.4, we will
provide an alternate proof of this result based on the Şaşoğlu argument [Sas08] and
the two-sender simultaneous decoding techniques from Theorem 4.2.

The differences between the Chong-Motani-Garg coding strategy and the Han-Kobayashi
coding strategy are: (1) the different way the senders' codebooks are constructed
and (2) the relaxed decoding requirements for the two receivers. We discuss
these next.

Codebook construction: The codebooks are constructed using the superposition
coding technique, which was originally developed by Cover in the context of the classical
broadcast channel [Cov72]. The idea behind this encoding strategy is to first generate a
set of cloud centers for each common message and then choose the satellite codewords
for the personal messages relative to the cloud centers.

Let $Q, W_1, W_2$ be auxiliary random variables and let $\mathcal{P}_{CMG}$ be the class of probability
density functions which factorize as
$p_{CMG}(q, w_1, x_1, w_2, x_2) = p(q)\, p(w_1|q)\, p(x_1|w_1, q)\, p(w_2|q)\, p(x_2|w_2, q)$.
To construct the codebook we proceed as follows:

• First randomly and independently generate a sequence $q^n$ according to $\prod_{i=1}^{n} p_Q(q_i)$.

• Randomly and independently generate $2^{nR_{1c}}$ sequences $w_1^n(m_{1c})$, $m_{1c} \in [1 : 2^{nR_{1c}}]$,
conditionally on the sequence $q^n$ according to $\prod_{i=1}^{n} p_{W_1|Q}(w_{1i}|q_i)$.

• Next, for each message $m_{1c}$, we randomly and independently generate $2^{nR_{1p}}$ conditional
codewords $x_1^n(m_{1p}|m_{1c})$, $m_{1p} \in [1 : 2^{nR_{1p}}]$, $m_{1c} \in [1 : 2^{nR_{1c}}]$, according to the
product conditional probability distribution $\prod_{i=1}^{n} p_{X_1|W_1Q}(x_{1i}|w_{1i}(m_{1c}), q_i)$.

• We generate the common and personal codebooks for Sender 2 in a similar fashion.
First generate $\{w_2^n(m_{2c})\}$, $m_{2c} \in [1 : 2^{nR_{2c}}]$, according to $\prod^{n} p_{W_2|Q}$, then
generate $\{x_2^n(m_{2p}|m_{2c})\}$, $m_{2p} \in [1 : 2^{nR_{2p}}]$, $m_{2c} \in [1 : 2^{nR_{2c}}]$, conditionally on
$w_2^n(m_{2c})$ according to $\prod^{n} p_{X_2|W_2Q}$.
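
The cloud-center and satellite structure of the superposition codebook can be sketched as follows for one sender. As before, this is a hedged illustration: the names (`superposition_codebook`, `p_x_given_wq`, and so on) are hypothetical, distributions are plain dictionaries, and codebook sizes are passed directly rather than as $2^{nR}$.

```python
import random

def sample(dist, rng):
    """Draw one symbol from a dict {symbol: probability}."""
    symbols = list(dist)
    return rng.choices(symbols, weights=[dist[s] for s in symbols])[0]

def superposition_codebook(n, num_common, num_personal,
                           p_q, p_w_given_q, p_x_given_wq, rng):
    """Superposition codebook for one sender (sketch of the construction above).

    p_w_given_q[q] is the distribution of W given q, and
    p_x_given_wq[(w, q)] is the distribution of X given (w, q).
    """
    # Time-sharing sequence q^n (shared by both senders in the actual scheme).
    q = [sample(p_q, rng) for _ in range(n)]
    # Cloud centers w^n(m_c), one per common message.
    clouds = [[sample(p_w_given_q[qi], rng) for qi in q]
              for _ in range(num_common)]
    # Satellite codewords x^n(m_p | m_c), drawn conditionally on the cloud center.
    satellites = {}
    for mc, w in enumerate(clouds):
        for mp in range(num_personal):
            satellites[(mp, mc)] = [sample(p_x_given_wq[(wi, qi)], rng)
                                    for wi, qi in zip(w, q)]
    return q, clouds, satellites

if __name__ == "__main__":
    rng = random.Random(0)
    p_q = {0: 1.0}
    p_w = {0: {0: 0.5, 1: 0.5}}
    # X equals W with probability 0.9 and is flipped with probability 0.1.
    p_x = {(w, 0): {w: 0.9, 1 - w: 0.1} for w in (0, 1)}
    q, clouds, sats = superposition_codebook(10, 2, 3, p_q, p_w, p_x, rng)
    print(clouds[0], sats[(0, 0)])
```

Unlike the split codebook of the Han-Kobayashi construction, the personal codewords here are not generated independently of the common ones: each satellite is drawn around the cloud center of its common message.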
Decoding for the MAC subproblems: The decoding task for each of the receivers
is associated with a multiple access channel subproblem. We will denote the achievable
rate regions for the MAC sub-problems for a fixed input distribution
$p_{CMG} \in \mathcal{P}_{CMG}$ as $\mathcal{R}^1_{CMG}(\mathcal{N}, p_{CMG})$ and $\mathcal{R}^2_{CMG}(\mathcal{N}, p_{CMG})$.

Consider the decoding task for Receiver 1. The messages to be decoded are
$(m_{1p}, m_{1c}, m_{2c})$, while the effects of the message $m_{2p}$ superimposed on top of the
codeword for $m_{2c}$ are considered as noise to be averaged over. The desired achievable rate
region $\mathcal{R}^1_{CMG}(\mathcal{N}, p_{CMG})$ is defined as follows:

$$ \mathcal{R}^1_{CMG}(\mathcal{N}, p_{CMG}) \equiv \left\{ (R_{1p}, R_{1c}, R_{2c}) \in \mathbb{R}_+^3 \ \middle|\ \text{Eqns (a1)-(d1) below} \right\} $$

$$ R_{1p} \leq I(X_1; B_1|W_1W_2Q) \equiv I(a_1), \tag{a1} $$
$$ R_{1p} + R_{1c} \leq I(X_1; B_1|W_2Q) \equiv I(b_1), \tag{b1} $$
$$ R_{1p} + R_{2c} \leq I(X_1W_2; B_1|W_1Q) \equiv I(c_1), \tag{c1} $$
$$ R_{1p} + R_{1c} + R_{2c} \leq I(X_1W_2; B_1|Q) \equiv I(d_1). \tag{d1} $$

The mutual information quantities are calculated with respect to the following state:

$$ \sum_{q, w_1, x_1, w_2} p(q)\, p(w_1|q)\, p(x_1|w_1, q)\, p(w_2|q)\ |q\rangle\langle q|^{Q} \otimes |w_1\rangle\langle w_1|^{W_1} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes |w_2\rangle\langle w_2|^{W_2} \otimes \bar{\rho}^{B_1}_{x_1, w_2}, \tag{5.21} $$

where

$$ \bar{\rho}^{B_1}_{x_1, w_2} \equiv \sum_{x_2} p(x_2|w_2, q)\ \mathrm{Tr}_{B_2}\big[ \rho^{B_1B_2}_{x_1, x_2} \big] \tag{5.22} $$

is the effective code state for Receiver 1. It is the average over the random variable
$X_2$ (since we treat $m_{2p}$ as noise) and the partial trace over the degrees of freedom
associated with Receiver 2.

The rate region for Receiver 2 is similarly described by:

$$ \mathcal{R}^2_{CMG}(\mathcal{N}, p_{CMG}) \equiv \left\{ (R_{2p}, R_{2c}, R_{1c}) \in \mathbb{R}_+^3 \ \middle|\ \text{Eqns (a2)-(d2) below} \right\} \tag{5.23} $$

$$ R_{2p} \leq I(X_2; B_2|W_1W_2Q) \equiv I(a_2), \tag{a2} $$
$$ R_{2p} + R_{2c} \leq I(X_2; B_2|W_1Q) \equiv I(b_2), \tag{b2} $$
$$ R_{2p} + R_{1c} \leq I(X_2W_1; B_2|W_2Q) \equiv I(c_2), \tag{c2} $$
$$ R_{2p} + R_{2c} + R_{1c} \leq I(X_2W_1; B_2|Q) \equiv I(d_2), \tag{d2} $$

with respect to a code state in which the variable $X_1$ is treated as noise and a partial
trace over the system $B_1$ is performed.


Observe that the above MAC rate regions are described by only four inequalities,
rather than by seven inequalities like the multiple access channel with three senders
(cf. Conjecture 4.1). Two of the rate constraints do not appear because we are using the
superposition encoding technique and always decode $m_{1c}$ before $m_{1p}$. A third inequality
can be dropped if we recognize that Receiver 1 is not really interested in decoding $m_{2c}$;
he is only decoding $m_{2c}$ to serve as side information which will help him decode the
messages $m_{1c}$ and $m_{1p}$ intended for him. This is called relaxed decoding, and it allows
us to drop the constraint associated with the decoding of $m_{2c}$ after $m_{1c}$ and $m_{1p}$ [CMG06].
The relaxed decoding approach cannot be applied directly to the quantum case, and so
a different decoding strategy is required [Sen12a]. We postpone the discussion about
the decoding strategies of the receivers until the end of this section.



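We are now in a position to describe the Chong-Motani-Garg rate region $\mathcal{R}_{\mathrm{CMG}}$,
which is obtained by combining the constraints from $\mathcal{R}^1_{\mathrm{CMG}}$ and $\mathcal{R}^2_{\mathrm{CMG}}$. Recall that,
for the interference channel problem, we are interested in the total rates achievable
between each sender and the corresponding receiver. For Receiver 1, we have a net
rate of $R_1 = R_{1c} + R_{1p}$ and similarly for Receiver 2 we have $R_2 = R_{2c} + R_{2p}$. Consider
the projection $\Pi$ which takes the 4-tuple of rates $(R_{1p}, R_{1c}, R_{2c}, R_{2p})$ to the space of
net rates $(R_1, R_2)$:

$$
\begin{bmatrix} R_1 \\ R_2 \end{bmatrix}
= \begin{bmatrix} R_{1p} + R_{1c} \\ R_{2p} + R_{2c} \end{bmatrix}
= \underbrace{\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}}_{\Pi}
\begin{bmatrix} R_{1p} \\ R_{1c} \\ R_{2c} \\ R_{2p} \end{bmatrix}.
\qquad (5.24)
$$

The Chong-Motani-Garg rate region for the interference channel is obtained by taking
the union over all input distributions of the intersection between the two MAC rate
regions, followed by the projection $\Pi$ to obtain:

$$
\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}) \equiv \Pi\left( \bigcup_{p_{\mathrm{CMG}} \in \mathcal{P}_{\mathrm{CMG}}} \mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}}) \cap \mathcal{R}^2_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}}) \right). \qquad (5.25)
$$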

Equivalently, it is possible to compute the intersection of the two MAC rate regions
by performing Fourier-Motzkin elimination on the inequalities from equations (a1)-(d1)
and (a2)-(d2). By taking all possible combinations of the inequalities in the two MAC
subproblems, we obtain the equivalent set of inequalities in the two dimensional space
$(R_1, R_2)$. The resulting achievable rate region has the following form:

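Theorem 5.4 (Quantum Chong-Motani-Garg rate region [Sen12a]). The following
rate region is achievable for the quantum interference channel:

$$
\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}) \equiv \bigcup_{\substack{p(x_1|w_1,q)\,p(w_1|q) \\ p(x_2|w_2,q)\,p(w_2|q)\,p(q)}} \left\{ (R_1, R_2) \in \mathbb{R}^2_+ \;\middle|\; \text{Eqns. (CMG1)-(CMG9) hold} \right\} \qquad (5.26)
$$

$R_1 \leq I(X_1; B_1|W_2Q)$ (CMG1)

$R_1 \leq I(X_1; B_1|W_1W_2Q) + I(X_2W_1; B_2|W_2Q)$ (CMG2)

$R_2 \leq I(X_2; B_2|W_1Q)$ (CMG3)

$R_2 \leq I(X_1W_2; B_1|W_1Q) + I(X_2; B_2|W_1W_2Q)$ (CMG4)

$R_1 + R_2 \leq I(X_1W_2; B_1|Q) + I(X_2; B_2|W_1W_2Q)$ (CMG5)

$R_1 + R_2 \leq I(X_1; B_1|W_1W_2Q) + I(X_2W_1; B_2|Q)$ (CMG6)

$R_1 + R_2 \leq I(X_1W_2; B_1|W_1Q) + I(X_2W_1; B_2|W_2Q)$ (CMG7)

$2R_1 + R_2 \leq I(X_1W_2; B_1|Q) + I(X_1; B_1|W_1W_2Q) + I(X_2W_1; B_2|W_2Q)$ (CMG8)

$R_1 + 2R_2 \leq I(X_2; B_2|W_1W_2Q) + I(X_2W_1; B_2|Q) + I(X_1W_2; B_1|W_1Q)$ (CMG9)

where the information theoretic quantities are taken with respect to a state of the form

$$
\theta^{QW_1X_1W_2X_2B_1B_2} \equiv \sum_{q,w_1,w_2,x_1,x_2} p_Q(q)\, p_{W_1|Q}(w_1|q)\, p_{W_2|Q}(w_2|q)\, p_{X_1|W_1Q}(x_1|w_1,q)\, p_{X_2|W_2Q}(x_2|w_2,q)\;
|q\rangle\langle q|^{Q} \otimes |w_1\rangle\langle w_1|^{W_1} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes |w_2\rangle\langle w_2|^{W_2} \otimes |x_2\rangle\langle x_2|^{X_2} \otimes \rho^{B_1B_2}_{x_1,x_2}.
$$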
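The relation between the two MAC subproblems and the inequalities (CMG1)-(CMG9) can be checked numerically in one direction. The following Python sketch enumerates rate 4-tuples on a grid, keeps those that satisfy (a1)-(d1) and (a2)-(d2), projects them with the map of (5.24), and confirms that the resulting net rates satisfy (CMG1)-(CMG9). The numerical values of $I(a_1), \ldots, I(d_2)$ are made-up placeholders and the function names are hypothetical; this is only an illustration of the elimination step, not a computation for any particular channel.

```python
import itertools

# Placeholder mutual-information values in bits (assumed, purely illustrative).
I1 = {"a": 0.3, "b": 0.5, "c": 0.6, "d": 0.7}   # I(a1), I(b1), I(c1), I(d1)
I2 = {"a": 0.2, "b": 0.4, "c": 0.5, "d": 0.6}   # I(a2), I(b2), I(c2), I(d2)
EPS = 1e-9                                      # tolerance for float round-off

def in_mac(rp, rc_own, rc_other, I):
    """Inequalities (a)-(d) of one receiver: personal, own common, other common."""
    return (rp <= I["a"] + EPS
            and rp + rc_own <= I["b"] + EPS
            and rp + rc_other <= I["c"] + EPS
            and rp + rc_own + rc_other <= I["d"] + EPS)

def project(r1p, r1c, r2c, r2p):
    """Projection of (5.24): split rates -> net rates (R1, R2)."""
    return (r1p + r1c, r2p + r2c)

def in_cmg(R1, R2):
    """Inequalities (CMG1)-(CMG9) expressed through I(a1)..I(d2)."""
    bounds = [
        (R1,          I1["b"]),                      # CMG1
        (R1,          I1["a"] + I2["c"]),            # CMG2
        (R2,          I2["b"]),                      # CMG3
        (R2,          I1["c"] + I2["a"]),            # CMG4
        (R1 + R2,     I1["d"] + I2["a"]),            # CMG5
        (R1 + R2,     I1["a"] + I2["d"]),            # CMG6
        (R1 + R2,     I1["c"] + I2["c"]),            # CMG7
        (2 * R1 + R2, I1["d"] + I1["a"] + I2["c"]),  # CMG8
        (R1 + 2 * R2, I2["a"] + I2["d"] + I1["c"]),  # CMG9
    ]
    return all(lhs <= rhs + EPS for lhs, rhs in bounds)

def feasible_tuples(step=0.1, top=0.8):
    """Grid points (R1p, R1c, R2c, R2p) satisfying both MAC subproblems."""
    n = int(round(top / step)) + 1
    grid = [i * step for i in range(n)]
    for r1p, r1c, r2c, r2p in itertools.product(grid, repeat=4):
        if in_mac(r1p, r1c, r2c, I1) and in_mac(r2p, r2c, r1c, I2):
            yield (r1p, r1c, r2c, r2p)

tuples = list(feasible_tuples())
all_ok = all(in_cmg(*project(*t)) for t in tuples)
```

The sketch only tests that feasible 4-tuples project into the region; the converse direction (every pair satisfying (CMG1)-(CMG9) arises from some feasible 4-tuple) is what the Fourier-Motzkin elimination establishes and is not checked here.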

The classical CMG rate region is known to be equivalent to the Han-Kobayashi
rate region [CMGEG08]. Thus, Sen's achievability proof for the rates of the Chong-
Motani-Garg rate region is also a proof of the quantum Han-Kobayashi rate region.



Quantum relaxed decoding 

Let us consider more closely the relaxed decoding approach that is employed by
Receiver 1 in the classical case. The decoding strategy for Receiver 1 is to use jointly
typical decoding and search the codebooks $\{w_1^n(m_{1c})\}$, $\{x_1^n(m_{1p}|m_{1c})\}$ and $\{w_2^n(m_{2c})\}$
for messages $(m_{1c}, m_{1p}, m_{2c})$ such that the corresponding codewords are jointly typical
with the received sequence.

If such messages are found, the decoder will output $m_1 = (m_{1c}, m_{1p})$. This decoding
is relaxed because the above condition can be satisfied for some $m_{2c}$ which is not
necessarily the correct $m_{2c}$ transmitted by Sender 2.

The use of the relaxed decoding strategy allows us to drop the following constraint:

$R_{2c} \leq I(W_2; B_1|W_1X_1)$, (5.27)

which corresponds to the message $m_{2c}$ being decoded last, given the side information
of $m_{1c}$ and $m_{1p}$.

The relaxed decoding strategy does not generalize readily to the case where a quantum
decoding is to be performed [Sen12a]. For each message triple $(m_{1c}, m_{1p}, m_{2c})$, we
could define the measurement $\{\Lambda_{m_{1c},m_{1p},m_{2c}}\}$, but how does one combine the
measurement operators $\{\Lambda_{m_{1c},m_{1p},\tilde{m}_{2c}}\}$, $\tilde{m}_{2c} \in [2^{nR_{2c}}]$, to form a "relaxed measurement"? Indeed,
the usual quantum measurements we use are ones that "ask specific questions" and
for which one outcome is more likely than the others. This allows us to use the gentle
operator lemma, which tells us that our measurement disturbs the system only
marginally.

Sen sidestepped the difficulty of asking a "vague" question by using two different
decoding strategies depending on which rates we want to achieve. Receiver 1 will
either decode $m_{2c}$ or ignore it altogether. The set of achievable rates for Receiver 1,
$(R_{1p}, R_{1c}, R_{2c}) \in \mathbb{R}^3_+$, obtained by Sen is described as follows:





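$$
\begin{array}{rclcrcl}
R_{2c} &\leq& I(W_2;B_1|X_1), & & & & \\
R_{1p} &\leq& I(X_1;B_1|W_1W_2), & & R_{2c} &>& I(W_2;B_1|X_1), \\
R_{1c} + R_{1p} &\leq& I(X_1;B_1|W_2), & \quad\text{OR}\quad & R_{1p} &\leq& I(X_1;B_1|W_1), \\
R_{2c} + R_{1p} &\leq& I(X_1W_2;B_1|W_1), & & R_{1c} + R_{1p} &\leq& I(X_1;B_1). \\
R_{1c} + R_{2c} + R_{1p} &\leq& I(X_1W_2;B_1), & & & &
\end{array}
$$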











Note that the region is not convex. To achieve the rates on the left hand side, Sen
developed a novel three-sender simultaneous decoding measurement. The rates on the
right hand side correspond to a disinterested MAC problem, in which the message $m_{2c}$
will not be decoded. After taking the intersection of the achievable rate regions for
Receiver 1 and Receiver 2 and applying the projection as in (5.25), Sen obtained a
region which is equivalent to the quantum CMG rate region [Sen12a].

In the next section we will describe another route to prove the achievability of the
quantum CMG rate region. We will show that the use of three-sender simultaneous
decoding is not necessary. Each of the receivers will use one of three different decoding
strategies that only require two-sender simultaneous decoding and, in combination,
these decoding strategies achieve all the rates $(R_1, R_2) \in \mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$.



5.5 Quantum CMG rate region via two-sender simultaneous decoding

In the original Han-Kobayashi paper [HK81] and the subsequent Chong-Motani-Garg
papers [CMG06], [CMGEG08], the decoding strategy is to use the three-sender
simultaneous decoder. This strategy allows for all possible interference cancellation scenarios.
An example of a specific decoding strategy would be to decode the interference
message $m_{2c}$ simultaneously with $m_{1c}$ and then decode $m_{1p}$ last using the side information
from both common messages. We denote this $(m_{1c}, m_{2c}) \to m_{1p}|m_{1c}m_{2c}$. Another
example would be to decode $m_{1p}$ and $m_{2c}$ simultaneously after having decoded $m_{1c}$
first: $m_{1c} \to (m_{1p}|m_{1c}, m_{2c}|m_{1c})$. Simultaneous decoding is a catchall strategy that
subsumes all of the above specific strategies. However, as we saw in Chapter 4, the
existence of a simultaneous decoder for a general three-sender QMAC is still an open
problem (Conjecture 4.1). It would therefore be desirable to find some specific quantum
decoding strategy (or a set of strategies like in [Sen12a]) which can be used to
achieve all the rates of the quantum CMG rate region.

In this section, we will extend the geometrical argument presented in [Sas08] to
do away with the need for the simultaneous decoding of three messages. We will show
that the quantum two-sender simultaneous decoder from Theorem 4.2 is sufficient to
achieve the quantum Han-Kobayashi rate region.



Observe that in equation (5.24) only the sum rate $R_{1c} + R_{1p}$ is of importance for
Receiver 1. The relative values of $R_{1c}$ and $R_{1p}$ are not important, only their sum
matters (provided that all the inequalities (a1)-(d1) are satisfied). This fact implies that we
are allowed a certain freedom in the way we choose the rates of the codebooks for the
interference channel. We define this freedom more formally as follows:

Definition 5.4 (Rate moving operation). Let $p_{\mathrm{CMG}}$ be the probability distribution
used to construct CMG codebooks. Let $C$ and $C'$ be two codebooks with rates

$C : (R_{1p}, R_{1c}, R_{2c}, R_{2p})$ (5.28)
$C' : (R_{1p} + \delta_1, R_{1c} - \delta_1, R_{2c} - \delta_2, R_{2p} + \delta_2)$. (5.29)

If the rates of both codebooks satisfy all the inequalities (a1)-(d1) and (a2)-(d2),
then they achieve the same rate pair $(R_1, R_2) \in \mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$. Such a
transformation of rate tuples is called a rate moving operation.

In words, we say that to achieve the rate pair $(R_1, R_2)$ for the interference channel,
we are free to move the rate points so as to decrease the common rates and increase
the personal rates. Intuitively, such a transformation is interesting because decreasing
the common rates will make the decoding task easier overall, since both receivers have
to decode the common messages whereas only a single receiver needs to decode the
personal part. The idea for this rate moving operation is due to Eren Şaşoğlu [Sas08].
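The rate moving operation of Definition 5.4 is simple enough to write out directly. The Python sketch below applies the shifts $\delta_1$ and $\delta_2$ to a 4-tuple of rates and computes the net rates through the projection of (5.24). The rate values are made up for illustration and the function names are hypothetical; the sketch does not check the inequalities (a1)-(d1) and (a2)-(d2), which the definition also requires.

```python
from fractions import Fraction as F

def rate_move(rates, d1, d2):
    """Shift rate from the common to the personal codebooks.

    rates = (R1p, R1c, R2c, R2p); d1 and d2 are the amounts moved
    for Sender 1 and Sender 2 respectively (Definition 5.4).
    """
    r1p, r1c, r2c, r2p = rates
    moved = (r1p + d1, r1c - d1, r2c - d2, r2p + d2)
    if min(moved) < 0:
        raise ValueError("rate moving would make a rate negative")
    return moved

def net_rates(rates):
    """Projection of (5.24): (R1p, R1c, R2c, R2p) -> (R1, R2)."""
    r1p, r1c, r2c, r2p = rates
    return (r1p + r1c, r2p + r2c)

# Made-up rates, stored as exact fractions so the comparison is exact.
before = (F(1, 10), F(4, 10), F(3, 10), F(2, 10))
after = rate_move(before, F(1, 10), F(1, 20))
```

Here `net_rates(before)` and `net_rates(after)` coincide, while both common rates have decreased, which is the sense in which the two codebooks are equivalent for the interference channel.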



To show the achievability of the Chong-Motani-Garg rate region, $\mathcal{R}_{\mathrm{CMG}}(\mathcal{N})$, it is
sufficient to show that we can achieve points on the boundary of the region, which
we will denote as $\partial\mathcal{R}_{\mathrm{CMG}}(\mathcal{N})$. In fact, it is sufficient to achieve points on the
non-vertical, non-horizontal boundary of the rate region, which we will denote
$\partial'\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}) \subsetneq \partial\mathcal{R}_{\mathrm{CMG}}(\mathcal{N})$. This region is illustrated in Figure 5.7 (b). We refer to the facets that
make up $\partial'\mathcal{R}_{\mathrm{CMG}}(\mathcal{N})$ as the dominant facets of the CMG rate region, in analogy
with the dominant facet of the multiple access channel capacity region.

We now state the main theorem of this section:

Theorem 5.5 (The dominant facets of the QCMG are achievable). Any rate
pair $(R_1, R_2) \in \partial'\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ on the non-horizontal, non-vertical facets
of the CMG rate region is achievable for the quantum interference channel.

As a corollary of the above theorem, we can say that the quantum Chong-Motani-Garg
rate region is achievable. Any point in the interior of the CMG rate region
$\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ is dominated by some point on the non-vertical, non-horizontal
dominant facets of the boundary $\partial'\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$. Therefore, we can achieve all other
points of the rate region by resource wasting.



(a) The CMG achievable rate region.

(b) The non-horizontal, non-vertical dominant facets of the CMG rate region, $\partial'\mathcal{R}_{\mathrm{CMG}}$, which
are achievable by two-sender simultaneous decoding, are shown in bold.

Figure 5.7: The CMG achievable rate region for a given input distribution
$p(q)p(w_1, x_1|q)p(w_2, x_2|q)$ in general has the shape of a heptagon. The region is bounded
by the two rate positivity conditions and each of the other facets corresponds to one of the
inequalities (CMG1)-(CMG9).



The proof of Theorem 5.5 is somewhat long, so we have broken it up into several
lemmas. Below we give a brief sketch of the steps involved:

• In Section 5.5.1, we will discuss the geometry of the achievable rate regions
$\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ and $\mathcal{R}^2_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ for the two receivers. We state Lemma 5.2,
which identifies the relative placement of the inequalities (a1)-(d1) by using the
properties of the mutual information quantities $I(a_1)$ through $I(d_1)$.

• In Section 5.5.2, we will show that any rate pair $(R_1, R_2) \in \partial'\mathcal{R}_{\mathrm{CMG}}$ can be
achieved using codebooks with rates that lie either on the (a) or (c) planes of
the MAC rate regions. To show this statement, we will prove Lemma 5.3, which
describes a procedure in which we use rate moving to transfer any rate point on
the (b) or (d) planes to an equivalent rate point on the (a) or (c) planes.

• In Section 5.5.3, we prove that the receivers can use two-sender quantum
simultaneous decoding to achieve any rate on the planes (a) and (c). More precisely,
there are three possible decode orderings that may be used. Lemma 5.4 shows
that the following three decoding strategies (shown for Receiver 1) are sufficient
to achieve the rates in the CMG rate region:

Case a: $(m_{1c}, m_{2c}) \to m_{1p}|m_{1c}m_{2c}$,

Case c: $m_{1c} \to (m_{1p}|m_{1c}, m_{2c}|m_{1c})$,

Case c': $m_{1c} \to m_{1p}|m_{1c}$.

5.5.1 Geometry of the CMG rate region 

For a general input distribution $p_{\mathrm{CMG}}$, the CMG rate region $\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ and the
two MAC subproblem rate regions could take on different shapes depending on the
relative values of the mutual information quantities $I(a_1)$, $I(b_1)$, $I(c_1)$, $I(d_1)$, $I(a_2)$,
$I(b_2)$, $I(c_2)$ and $I(d_2)$.

In his paper [Sas08], Şaşoğlu develops a powerful intuition for dealing with the
polyhedra that describe their boundaries $\partial\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$, $\partial\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ and
$\partial\mathcal{R}^2_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$. Define the two-dimensional facets $a_1, b_1, c_1, d_1$ which make up the
region boundary. Each facet is a subset of the plane in $\mathbb{R}^3$ associated with the equality
condition of inequalities (a1), (b1), (c1) and (d1), which correspond to the rate
constraints of Receiver 1. The boundary of the region $\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ can be written as
$\partial\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}}) = a_1 \cup b_1 \cup c_1 \cup d_1$.

We can visualize the three dimensional rate region $\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ as in Figure 5.8
below.

Figure 5.8: The achievable rate region $\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ and its bounding facets $a_1$, $b_1$, $c_1$
and $d_1$. Each surface is associated with the equality condition in one of the equations (a1),
(b1), (c1) and (d1) from page 91.

The shape of the rate region is governed by the information-theoretic quantities
on the right hand side of equations (a1) through (d1). The following relations establish
the geometry of the rate region $\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ and hold for any input distribution.

Lemma 5.2 (Geometry of $\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$). The information-theoretic quantities
from equations (a1), (b1), (c1) and (d1) satisfy the following inequalities:

$I(a_1) \leq I(b_1) \leq I(d_1)$, (5.30)
$I(a_1) \leq I(c_1) \leq I(d_1)$, (5.31)
$I(a_1) + I(d_1) \leq I(b_1) + I(c_1)$. (5.32)

Geometrically, $I(a_1) \leq I(b_1)$ indicates that the plane containing $b_1$ intersects the
plane containing $a_1$ in the positive octant. Similarly, $I(b_1) \leq I(d_1)$ indicates that
the plane containing $d_1$ intersects the plane containing $b_1$ inside $\mathbb{R}^3_+$. Equation (5.31)
dictates that the plane containing $c_1$ intersects the plane containing $a_1$ and that the
plane containing $d_1$ intersects the plane of $c_1$. Finally, equation (5.32) states that
$I(a_1) + I(d_1) \leq I(b_1) + I(c_1)$, which means that the rate constraint on the sum
$2R_{1p} + R_{1c} + R_{2c}$ obtained by adding (a1) and (d1) is tighter than the rate constraint obtained
by adding (b1) and (c1). If we define the sets $A = \{1p, 1c\}$ and $B = \{1p, 2c\}$ and $\rho(X)$
to be the information-theoretic quantities of the right hand side, then equation (5.32)
has the submodular polymatroid structure $\rho(A \cap B) + \rho(A \cup B) \leq \rho(A) + \rho(B)$. The
proof of Lemma 5.2 is given in Appendix C.1.
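The inequalities of Lemma 5.2 can be probed numerically in the special case of a classical channel, which corresponds to commuting output states. The Python sketch below is such a check under stated assumptions: all variables are binary, the time-sharing variable $Q$ is trivial, the output of Receiver 1 is a classical variable `Y`, and the distributions are drawn at random. The helper names are hypothetical and the check is a sanity test, not a substitute for the proof in Appendix C.1.

```python
import itertools
import math
import random

def random_dist(n, rng):
    """A random probability vector of length n with strictly positive entries."""
    w = [rng.random() + 0.05 for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

def build_joint(rng):
    """Joint pmf over (w1, x1, w2, x2, y) with the superposition structure
    p(w1) p(x1|w1) p(w2) p(x2|w2) p(y|x1,x2); all alphabets are binary."""
    pw1, pw2 = random_dist(2, rng), random_dist(2, rng)
    px1 = [random_dist(2, rng) for _ in range(2)]   # px1[w1][x1]
    px2 = [random_dist(2, rng) for _ in range(2)]   # px2[w2][x2]
    py = [[random_dist(2, rng) for _ in range(2)] for _ in range(2)]  # py[x1][x2][y]
    joint = {}
    for w1, x1, w2, x2, y in itertools.product(range(2), repeat=5):
        joint[(w1, x1, w2, x2, y)] = (pw1[w1] * px1[w1][x1] * pw2[w2]
                                      * px2[w2][x2] * py[x1][x2][y])
    return joint

def entropy(joint, idx):
    """Shannon entropy in bits of the marginal on the coordinates in idx."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def cmi(joint, a, b, c=()):
    """Conditional mutual information I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C)."""
    a, b, c = tuple(a), tuple(b), tuple(c)
    return (entropy(joint, a + c) + entropy(joint, b + c)
            - entropy(joint, a + b + c) - entropy(joint, c))

W1, X1, W2, X2, Y = range(5)   # coordinate positions in each outcome tuple

def geometry_quantities(joint):
    """Classical analogues of I(a1), I(b1), I(c1), I(d1)."""
    return {
        "a": cmi(joint, [X1], [Y], [W1, W2]),
        "b": cmi(joint, [X1], [Y], [W2]),
        "c": cmi(joint, [X1, W2], [Y], [W1]),
        "d": cmi(joint, [X1, W2], [Y]),
    }

def lemma_5_2_holds(q, tol=1e-9):
    return (q["a"] <= q["b"] + tol and q["b"] <= q["d"] + tol      # (5.30)
            and q["a"] <= q["c"] + tol and q["c"] <= q["d"] + tol  # (5.31)
            and q["a"] + q["d"] <= q["b"] + q["c"] + tol)          # (5.32)

rng = random.Random(0)
checks = [lemma_5_2_holds(geometry_quantities(build_joint(rng))) for _ in range(200)]
```

All 200 random instances pass, as expected, since in this classical setting (5.32) reduces to $I(W_1;W_2|Y) \geq 0$ for independent $W_1$ and $W_2$.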
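5.5.2 Şaşoğlu argument

Let the rate pair $(R_1, R_2) \in \partial'\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ be part of the non-horizontal,
non-vertical boundary of the two dimensional rate region $\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$. This rate
pair is associated (non-uniquely) to a pair of points $P_1 = (R_{1p}, R_{1c}, R_{2c})$ and $P_2 =
(R_{2p}, R_{2c}, R_{1c})$ on the boundaries of the respective regions $\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ and $\mathcal{R}^2_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$.

Claim 5.6. If the two-dimensional rate pair $(R_1, R_2) \in \partial'\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ is the
projection of the points $P_1 = (R_{1p}, R_{1c}, R_{2c})$ and $P_2 = (R_{2p}, R_{2c}, R_{1c})$ via the mapping in
(5.24), then $P_1 \in \partial\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ and $P_2 \in \partial\mathcal{R}^2_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$.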



Suppose that this were not the case, that is, assume that at least one of
the points $P_i$ is not on the boundary $\partial\mathcal{R}^i_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ of its region. If, for a
contradiction, $P_i$ is in the interior of $\mathcal{R}^i_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$, then there must exist a ball
of achievable rates of size $\delta$ around $P_i$. This means that we would be able to increase
the private rate to $R'_{ip} = R_{ip} + \delta$ for some $\delta > 0$. The resulting point $P'_i$ will
still be achievable so long as we stay within the region $\mathcal{R}^i_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$. However, such a $\delta$
displacement leads to an increase of the net rate $R'_i = R'_{ip} + R'_{ic} = R_{ip} + \delta + R_{ic} = R_i + \delta$.
This contradicts our initial assumption that $(R_1, R_2) \in \partial'\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$. Therefore,
Claim 5.6 must be true, and this means that it is sufficient to show how to achieve all
the rates on the boundary of the rate regions $\partial\mathcal{R}^i_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}}) = a_i \cup b_i \cup c_i \cup d_i$.

A priori, we have to consider all possible starting combinations of the points
$P_i \in a_i \cup b_i \cup c_i \cup d_i$. However, using the rate moving operation (Definition 5.4), we can move
any point in $(b_i \cup d_i) \setminus (a_i \cup c_i)$ to an equivalent point in $a_i \cup c_i$, as illustrated in Figure 5.9.



Lemma 5.3 (Moving points [Sas08]). Any point $P_i$ that lies on one of the planes
$(b_i \cup d_i) \setminus (a_i \cup c_i)$ can be converted to a different point $P'_i$ on one of the planes $a_i \cup c_i$, while
leaving the net rates $(R_1, R_2)$ unchanged.

In order to be precise, we have to study the effects of the rate moving operation
on both points $P_1$ and $P_2$ simultaneously. This is because the same rates $R_{1c}$ and $R_{2c}$
appear in the common coordinates of both $P_1$ and $P_2$. The reasoning behind the proof
of Lemma 5.3 is reminiscent of the argument used to prove Claim 5.6. The details are
given in Appendix C.2.



Lemma 5.3 is important because in the next section we will show how to achieve
the rates in the facets $a_i$ and $c_i$ using two-sender quantum simultaneous decoding.
This means that we can construct a decoder that achieves all the rates for the quantum
Chong-Motani-Garg rate region without the need for a three-sender simultaneous
decoder from Conjecture 4.1.

Figure 5.9: Moving points on the $b_i$ and $d_i$ facets to equivalent points on $a_i$ and $c_i$.



5.5.3 Two-message simultaneous decoding is sufficient for the
rates of the facets $a_i$ and $c_i$

In this section we show how to achieve the rates on the $a_i$ and $c_i$ facets using only
two-sender simultaneous decoding.

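Lemma 5.4 (Two-simultaneous decoding for a and c planes). Fix an input
distribution $p_{\mathrm{CMG}} \in \mathcal{P}_{\mathrm{CMG}}$ and let the rate pair $(R_1, R_2) \in \partial'\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ come from
the rate triples $P_1 = (R_{1p}, R_{1c}, R_{2c}) \in \partial\mathcal{R}^1_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ and $P_2 = (R_{2p}, R_{2c}, R_{1c}) \in
\partial\mathcal{R}^2_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$ such that

$(P_1, P_2) \in (a_1 \cup c_1) \times (a_2 \cup c_2)$. (5.33)

Then the rate pair $(R_1, R_2)$ is achievable for the QIC using two-sender quantum
simultaneous decoding.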

Proof. Our analysis is similar to [Sas08], but we are not going to use a rate-splitting
strategy.

Achieving points in a: Consider a point $P_1 \in a_1$, which implies

$R_{1p} = I(X_1; B_1|W_1W_2Q)$, (5.34)

$R_{1p} + R_{1c} \leq I(X_1; B_1|W_2Q)$, (5.35)

$R_{1p} + R_{2c} \leq I(X_1W_2; B_1|W_1Q)$, (5.36)

$R_{1p} + R_{1c} + R_{2c} \leq I(X_1W_2; B_1|Q)$. (5.37)



We can subtract equation (5.34) from the inequalities below it to obtain a new set
of inequalities:

$R_{1p} = I(X_1; B_1|W_1W_2Q)$, (5.38)

$R_{1c} \leq I(W_1; B_1|W_2Q) = I(X_1; B_1|W_2Q) - I(X_1; B_1|W_1W_2Q)$, (5.39)

$R_{2c} \leq I(W_2; B_1|W_1Q) = I(X_1W_2; B_1|W_1Q) - I(X_1; B_1|W_1W_2Q)$, (5.40)

$R_{1c} + R_{2c} \leq I(W_1W_2; B_1|Q) = I(X_1W_2; B_1|Q) - I(X_1; B_1|W_1W_2Q)$. (5.41)
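The equalities on the right-hand sides of (5.39)-(5.41) are instances of the chain rule for mutual information. As a sketch for (5.39), assume (as in the CMG code state) that once $x_1$, $w_2$ and $q$ are fixed the state of $B_1$ no longer depends on $w_1$, so that $I(W_1;B_1|X_1W_2Q) = 0$:

```latex
\begin{align*}
I(X_1;B_1|W_2Q)
  &= I(W_1X_1;B_1|W_2Q) - I(W_1;B_1|X_1W_2Q)
     && \text{(chain rule)} \\
  &= I(W_1X_1;B_1|W_2Q)
     && \text{(since } I(W_1;B_1|X_1W_2Q) = 0\text{)} \\
  &= I(W_1;B_1|W_2Q) + I(X_1;B_1|W_1W_2Q)
     && \text{(chain rule)}.
\end{align*}
```

Rearranging gives $I(W_1;B_1|W_2Q) = I(X_1;B_1|W_2Q) - I(X_1;B_1|W_1W_2Q)$. The same two steps, applied with $W_2$ moved from the conditioning into the first argument, give (5.40) and (5.41).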



Looking at equations (5.39)-(5.41) we see that the rates $(R_{1c}, R_{2c})$ have the form
of a MAC rate region with inputs $W_1 \sim p(w_1|q)$, $W_2 \sim p(w_2|q)$ and output $B_1$. We will
perform the decoding in the following order at Receiver 1: $(W_1, W_2) \to X_1|W_1W_2$.

Consider the quantum channel

$(w_1, w_2) \to \rho^{B_1}_{w_1,w_2}$, (5.42)

where $\rho^{B_1}_{w_1,w_2}$ is defined as the average output state, assuming that superposition encoding of
the random variables $X_1$ and $X_2$ will be performed:

$\rho^{B_1}_{w_1,w_2} \equiv \sum_{x_1} \sum_{x_2} p_{X_1|W_1}(x_1|w_1)\, p_{X_2|W_2}(x_2|w_2)\, \rho^{B_1}_{x_1,x_2}$. (5.43)

The decoding strategy for Receiver 1 when the rates are on the facet $a_1$ corresponds
to the use of the two-message simultaneous decoder (Theorem 4.2) on the channel
shown in (5.42).

After the common parts have been decoded, Receiver 1 will use a conditional HSW
decoder to decode the message encoded in $X_1$.
Achieving points in c: Consider a point $P_1 \in c_1$, which implies that the
constraint on the $R_{1p} + R_{2c}$ inequality is tight:

$R_{1p} \leq I(X_1; B_1|W_1W_2Q)$, (5.44)

$R_{1p} + R_{1c} \leq I(X_1; B_1|W_2Q)$, (5.45)

$R_{1p} + R_{2c} = I(X_1W_2; B_1|W_1Q)$, (5.46)

$R_{1p} + R_{1c} + R_{2c} \leq I(X_1W_2; B_1|Q)$. (5.47)



If we subtract (5.46) from (5.47) we obtain the following equivalent set of inequalities:

$R_{1p} \leq I(X_1; B_1|W_1W_2Q)$, (5.48)

$R_{1p} + R_{1c} \leq I(X_1; B_1|W_2Q)$, (5.49)

$R_{1p} + R_{2c} = I(X_1W_2; B_1|W_1Q)$, (5.50)

$R_{1c} \leq I(W_1; B_1|Q) = I(X_1W_2; B_1|Q) - I(X_1W_2; B_1|W_1Q)$. (5.51)



The constraint on the sum rate $R_{1p} + R_{1c}$ imposed by equation (5.49) is less tight
than the sum rate constraint obtained by adding equations (5.48) and (5.51), therefore
we will drop equation (5.49) from the remainder of the argument. The accuracy of this
statement can be verified starting from $I(W_1; W_2|B_1) \geq 0$ and rearranging the terms.
See Appendix C.3 for the details.



The decoding strategy depends on the position of the point $P_1$ lying within the $c_1$
plane. We will treat two cases separately.

Case c: Suppose $R_{1p}$ is such that:

$I(X_1; B_1|W_1Q) \leq R_{1p}$. (5.52)

If we subtract this lower bound on $R_{1p}$ from equation (5.50) we can obtain an
upper bound on $R_{2c}$. We also have an upper bound on $R_{1p}$ from (5.48) and a
bound on the sum rate $R_{1p} + R_{2c}$ from (5.50). This gives us the following rate
constraints:

$R_{1p} \leq I(X_1; B_1|W_1W_2Q)$, (5.48)

$R_{2c} \leq I(W_2; B_1|X_1Q) = I(X_1W_2; B_1|W_1Q) - I(X_1; B_1|W_1Q)$, (5.53)

$R_{1p} + R_{2c} = I(X_1W_2; B_1|W_1Q)$, (5.50)

$R_{1c} \leq I(W_1; B_1|Q)$. (5.54)

Şaşoğlu recognizes the rate constraints on $(R_{1p}, R_{2c})$ in equations (5.48), (5.53)
and (5.50) to correspond to the dominant facet of a MAC rate region for a channel
with inputs $X_1 \sim p(x_1|w_1, q)$, $W_2 \sim p(w_2|q)$ and output $(W_1, B_1)$. In other words,
we have a special channel where $W_1$ is available as side information for Sender 1
and Receiver 1. The decode order is given by: $W_1 \to (X_1|W_1, W_2|W_1)$.

To achieve rates on the plane $c_1$, Receiver 1 will first use a standard HSW decoder
to decode the message $m_{1c}$ encoded in $W_1$ and then apply the simultaneous
decoding as stated in the following lemma:
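Lemma 5.5 (Conditional simultaneous decoding). Let $\{w_1^n(\ell_1)\}_{\ell_1 \in [2^{nR_{1\alpha}}]}$ be a codebook
generated according to $\prod^n p_{W_1}$, and let $\{x_1^n(m_1|w_1^n(\ell_1))\}_{m_1 \in [2^{nR_{1\beta}}],\, \ell_1 \in [2^{nR_{1\alpha}}]}$ be a
conditional codebook generated according to $\prod^n p_{X_1|W_1}$. Similarly for Sender 2, we define
codebooks $\{w_2^n(\ell_2)\}_{\ell_2 \in [2^{nR_{2\alpha}}]}$ and another $\{x_2^n(m_2|w_2^n(\ell_2))\}_{m_2 \in [2^{nR_{2\beta}}],\, \ell_2 \in [2^{nR_{2\alpha}}]}$ generated
according to $\prod^n p_{W_2}$ and $\prod^n p_{X_2|W_2}$. Suppose these codebooks are used on $n$ copies of
the quantum multiple access channel $\rho_{x_1,x_2}$, resulting in the map:

$(w_1^n, x_1^n, w_2^n, x_2^n) \to \rho_{x_1^n|w_1^n,\, x_2^n|w_2^n}$. (5.55)

Consider the case where $w_1^n$ is known to the receiver, and $x_2^n$ is considered as noise
(averaged over). This situation corresponds to the following map:

$(w_1^n, x_1^n, w_2^n) \to \bar{\rho}_{x_1^n|w_1^n,\, w_2^n}$, (5.56)

where we defined $\bar{\rho}_{x_1^n|w_1^n,\, w_2^n} \equiv \mathbb{E}_{X_2^n}\, \rho_{x_1^n|w_1^n,\, x_2^n|w_2^n}$. In terms of the channel outputs:

$\bar{\rho}_{x_1^n|w_1^n,\, w_2^n} = \bigotimes_{i=1}^{n} \left( \sum_{x_2} p_{X_2|W_2}(x_2|w_{2i})\, \rho_{x_{1i},x_2} \right)$. (5.57)

An achievable rate region for the pair $(R_{1\beta}, R_{2\alpha})$ is described by:

$R_{1\beta} \leq I(X_1; B|W_1W_2)$, (5.58)
$R_{2\alpha} \leq I(W_2; B|X_1W_1) = I(W_2; B|X_1)$, (5.59)
$R_{1\beta} + R_{2\alpha} \leq I(X_1W_2; B|W_1)$, (5.60)

where the mutual information quantities are with respect to the state:

$\theta^{W_1X_1W_2B} \equiv \sum_{w_1,x_1,w_2} p(w_1, x_1)\, p(w_2)\, |w_1\rangle\langle w_1|^{W_1} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes |w_2\rangle\langle w_2|^{W_2} \otimes \bar{\rho}^{B}_{x_1,w_2}$. (5.61)

Proof. The proof is similar to the two-sender MAC simultaneous decoding from
Theorem 4.2. □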

Case c': Now suppose that $R_{1p} < I(X_1; B_1|W_1Q)$, then the trivial successive decoding
strategy is sufficient. Receiver 1 will decode in the order $W_1 \to X_1|W_1$.

The decoding is done sequentially using HSW decoding. Receiver 1 decodes
the message $m_{1c}$ first, followed by $m_{1p}$. The decoding in this case is similar to
the successive decoding used for the multiple access channel (Section 4.2). The
interfering messages $m_{2c}$ and $m_{2p}$ are treated as noise.

□



Thus we see that the combination of Lemma 5.2, Lemma 5.3, and Lemma 5.4 shows
that the quantum Chong-Motani-Garg rate region is achievable using only two-sender
simultaneous decoding.



5.6 Successive decoding strategies for interference channels

We report on some results concerning achievable rate regions for the interference
channel that use the successive decoding approach.

5.6.1 Time-sharing strategies 



In Section 4.2 on the multiple access channel, we saw that a successive decoding strategy
can be used to achieve all the rates on the dominant vertices of the rate region. Recall
that for a fixed choice of encoding distribution $p = p_{X_1}(x_1)p_{X_2}(x_2)$, the two-sender
QMAC capacity region has the shape of a pentagon with two extreme points $\alpha_p =
(I(X_1;B), I(X_2;B|X_1))$ and $\beta_p = (I(X_1;B|X_2), I(X_2;B))$, which correspond to the
rates achievable by successive decoding in two different orders. To achieve the rates
in the convex hull of these points, we can use time-sharing between different codes
achieving these rates.

Definition 5.5 (Time-sharing). Given two codebooks $C_1$ and $C_2$ with rates
corresponding to rate points $\alpha_p$ and $\beta_p$ and a desired rate point $P \in \mathrm{conv}(\alpha_p, \beta_p)$, we will
have

$P = t\alpha_p + (1 - t)\beta_p$, (5.62)

for some $t \in \mathbb{R}$ with $0 \leq t \leq 1$, which we call the time-sharing parameter. We can achieve the rates of a
point $P^* \approx P$ if we use the rational time-sharing parameter $t^* \approx t$, $t^* = \frac{M}{N} \in \mathbb{Q}$, and the
following strategy: during each $N$ block-uses of the channel, use codebook $C_1$ during
$M$ of them and during the remaining $N - M$ uses of the channel, use codebook $C_2$.
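The choice of the rational parameter $t^* = M/N$ in Definition 5.5 can be made concrete with a few lines of Python. The sketch below is only an illustration: the corner points are made-up numbers, the function name and the bound `max_blocks` on $N$ are hypothetical, and the rational approximation uses the standard library `fractions` module.

```python
from fractions import Fraction

def time_share(alpha, beta, t, max_blocks=100):
    """Approximate P = t*alpha + (1-t)*beta with a rational parameter.

    Returns (M, N, p_star): use codebook C1 in M out of every N
    block-uses and codebook C2 in the remaining N - M block-uses,
    which achieves the rate point p_star.
    """
    if not 0 <= t <= 1:
        raise ValueError("time-sharing parameter must lie in [0, 1]")
    # t* = M/N is the closest fraction to t with denominator N <= max_blocks.
    t_star = Fraction(t).limit_denominator(max_blocks)
    M, N = t_star.numerator, t_star.denominator
    p_star = tuple(float(t_star) * a + (1 - float(t_star)) * b
                   for a, b in zip(alpha, beta))
    return M, N, p_star

alpha_p = (0.4, 0.9)   # made-up corner point (I(X1;B), I(X2;B|X1))
beta_p = (0.7, 0.6)    # made-up corner point (I(X1;B|X2), I(X2;B))
M, N, p_star = time_share(alpha_p, beta_p, t=1 / 3)
```

With these numbers, codebook $C_1$ is used in one out of every three block-uses, and the achieved point lies on the segment joining the two corner points.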
The time-sharing strategy is not well-adapted for the interference channel. This
is because the rates of the corner points of the achievable rate regions for the two
receivers are not necessarily the same. The time-sharing strategy that works for one
of the receivers might not work for the other one.

It is however possible to use successive decoding strategies for an interference
channel in the following way. We start by considering a strategy where both receivers are
asked to decode both messages, i.e., we are dealing with the compound multiple access
channel. Such a strategy defines an achievable rate region known as the "successive
decoding inner bound" for the interference channel (cf. page 6-7 of Ref. [EGK10]).

Consider all possible decode orderings that could be used by the two receivers:

$\pi_1 : m_2 \to m_1|m_2$, $\quad \pi_2 : m_2$,
$\pi_1 : m_2 \to m_1|m_2$, $\quad \pi_2 : m_1 \to m_2|m_1$,
$\pi_1 : m_1$, $\quad \pi_2 : m_1 \to m_2|m_1$,
$\pi_1 : m_1$, $\quad \pi_2 : m_2$. (5.63)

Using each of these, we can achieve rates arbitrarily close to the following points:

$P_1 = (I(X_1;B_1|X_2),\ \min\{I(X_2;B_1), I(X_2;B_2)\})$, (5.64)
$P_2 = (\min\{I(X_1;B_1|X_2), I(X_1;B_2)\},\ \min\{I(X_2;B_1), I(X_2;B_2|X_1)\})$, (5.65)
$P_3 = (\min\{I(X_1;B_1), I(X_1;B_2)\},\ I(X_2;B_2|X_1))$, (5.66)
$P_4 = (I(X_1;B_1),\ I(X_2;B_2))$. (5.67)

We can use time-sharing between these different codes for the interference channel to
obtain all other rates in $\mathrm{conv}(P_1, P_2, P_3, P_4)$. This achievable rate region is illustrated
in the RHS of Figure 5.10.
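The four rate points of (5.64)-(5.67) are simple functions of six mutual information quantities. The following Python sketch evaluates them; the dictionary keys and the numerical values are hypothetical placeholders chosen for illustration and do not come from any particular channel.

```python
def successive_decoding_points(I):
    """Rate pairs (5.64)-(5.67) from six mutual-information values.

    The keys of I are an assumed naming convention, e.g. I["X1;B1|X2"]
    stands for I(X1;B1|X2).
    """
    P1 = (I["X1;B1|X2"], min(I["X2;B1"], I["X2;B2"]))
    P2 = (min(I["X1;B1|X2"], I["X1;B2"]), min(I["X2;B1"], I["X2;B2|X1"]))
    P3 = (min(I["X1;B1"], I["X1;B2"]), I["X2;B2|X1"])
    P4 = (I["X1;B1"], I["X2;B2"])
    return [P1, P2, P3, P4]

# Made-up values in bits, for illustration only.
I_values = {
    "X1;B1": 0.3, "X1;B1|X2": 0.8, "X1;B2": 0.5,
    "X2;B2": 0.4, "X2;B2|X1": 0.9, "X2;B1": 0.6,
}
points = successive_decoding_points(I_values)
```

Each `min` reflects the requirement that a message decoded by both receivers must be decodable at each of them.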



5.6.2 Split codebook strategies 



We can improve the successive decoding region described in Section 5.6 if we use split
codebooks. Inspired by the Han-Kobayashi strategy, we make the senders split their
messages into two parts: the messages of Sender 1 will be $m_{1p}$ and $m_{1c}$, and the
messages of Sender 2 will be $m_{2p}$ and $m_{2c}$. As in the Han-Kobayashi strategy, the use
of the split codebooks induces two three-sender multiple access channels. Receiver 1
is required to decode the set of messages $m_{1p}$, $m_{1c}$ and $m_{2c}$ using successive decoding,
and there are six different decode orderings he can use.

Figure 5.10 (two panels with axes $R_1$ and $R_2$; left: simultaneous decoding, right: successive
decoding): These plots show achievable rate regions for the interference channel for
simultaneous decoding and successive decoding strategies with fixed input distributions. Using
a simultaneous decoding strategy, it is possible to achieve the intersection of the two regions
of the corresponding multiple access channels. Using a successive decoding strategy, we
obtain four achievable rate points that correspond to the possible decoding orders for the two
multiple access channels. The solid red and blue lines outline the different multiple access
channel achievable rate regions, and the shaded gray areas outline the achievable rate regions
for the two different decoding strategies.



Let the decoding ordering of Receiver 1 be represented by a permutation $\pi_1$ on the
set of three elements $\{1p, 1c, 2c\}$. For example, the successive decoding of the messages
in the order $m_{2c} \to m_{1c}|m_{2c} \to m_{1p}|m_{1c}m_{2c}$ will be denoted as the permutation $\pi_1 =
(2c, 1c, 1p)$.

We can naturally use all $6 \times 6$ pairs of decoding orders to obtain a set of achievable
rate pairs.
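The $6 \times 6$ pairs of decode orderings can be enumerated mechanically. The following Python sketch lists them with `itertools`; the string labels for the messages and the helper `describe` are illustrative conventions introduced here, not notation from the text.

```python
from itertools import permutations, product

# Six orderings per receiver: permutations of the three messages it sees.
orders_rx1 = list(permutations(("1p", "1c", "2c")))
orders_rx2 = list(permutations(("2p", "2c", "1c")))
ordering_pairs = list(product(orders_rx1, orders_rx2))

def describe(order):
    """Render an ordering as a successive decoding chain.

    Each message is conditioned on all the messages decoded before it,
    listed in the order in which they were decoded.
    """
    steps = []
    for k, msg in enumerate(order):
        given = ",".join("m" + prev for prev in order[:k])
        steps.append("m" + msg + ("|" + given if given else ""))
    return " -> ".join(steps)

example = describe(("2c", "1c", "1p"))
```

The variable `example` corresponds to the permutation $\pi_1 = (2c, 1c, 1p)$ used as an illustration above.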

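Proposition 5.7. Consider the rate point $P$ associated with the decode ordering $\pi_1$
for Receiver 1 and $\pi_2$ for Receiver 2:

$P = \left( R^{(1)}_{1p} + \min\{R^{(1)}_{1c}, R^{(2)}_{1c}\},\ \min\{R^{(1)}_{2c}, R^{(2)}_{2c}\} + R^{(2)}_{2p} \right)$,

where the rate constraints for Receiver $j$ satisfy

$R^{(j)}_{\pi_j(1)} \leq I(X_{\pi_j(1)}; B_j)$, (5.68)

$R^{(j)}_{\pi_j(2)} \leq I(X_{\pi_j(2)}; B_j | X_{\pi_j(1)})$, (5.69)

$R^{(j)}_{\pi_j(3)} \leq \begin{cases} I(X_{\pi_j(3)}; B_j | X_{\pi_j(1)} X_{\pi_j(2)}) & \text{if } \pi_j(3) = jc \text{ or } \pi_j(3) = jp, \\ \infty & \text{otherwise.} \end{cases}$ (5.70)

The rate pair $P$ is achievable for the quantum interference channel, for all permutations
$\pi_1$ of the set of indices $(1p, 1c, 2c)$ and for all permutations $\pi_2$ of the set $(2p, 2c, 1c)$.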




Figure 5.11: These two figures plot rate pairs that the senders and receivers in a
classical Gaussian interference channel can achieve using successive decoding and rate-splitting
(SD+RS). The figures compare these rates with those achievable by the Han-Kobayashi (HK)
coding strategy, while also plotting the regions corresponding to the two induced multiple
access channels to each receiver (MAC1 and MAC2). The LHS figure demonstrates that, for
a particular choice of signal to noise (SNR) and interference to noise (INR) [ETW07]
parameters (SNR1 = 1.7, SNR2 = 2, INR1 = 3.4, INR2 = 4), successive decoding with rate-splitting
does not perform as well as the Han-Kobayashi strategy. The RHS figure demonstrates that,
for a different choice of parameters (SNR1 = 343, SNR2 = 296, INR1 = 5, INR2 = 5), the
two strategies perform equally well.

The rate region described by the convex hull of the points $P$ is generally smaller
than the Han-Kobayashi region, as illustrated in Figure 5.11. Note that the
split-codebook and successive decoding strategy works well in the low interference
regime. An interesting open problem is whether we can achieve all rates of the
Han-Kobayashi region by splitting each sender's message into more than two parts and
using only successive decoding.

In particular, we want to know whether the capacity of the interference channel
with strong interference can be achieved using only successive decoding. Alternately,
it would be interesting to prove that successive decoding is not sufficient to
achieve the capacity in the strong interference regime for any number of splits and
any possible decode order.

We know that the time-sharing, rate-splitting [Sas08] and generalized time-sharing
[YP11] strategies do not work for the interference channel, but is it possible to show a
negative result for all successive decoding strategies? This question is explored further
in [FS12].



5.7 Outer bound 

We will close this chapter by giving a simple outer bound for the capacity of general
quantum interference channels analogous to the classical result by Sato [Sat77].

Theorem 5.8 (Quantum Sato outer bound [Sav10]). Consider the Sato region defined
as follows:

$$
\mathcal{R}_{\mathrm{Sato}}(\mathcal{N}) \equiv \bigcup_{p_Q(q)\,p(x_1|q)\,p(x_2|q)} \left\{ (R_1, R_2) \in \mathbb{R}^2_+ \;\middle|\; \text{Eqns. (5.72)-(5.74) below} \right\}, \qquad (5.71)
$$

$R_1 \leq I(X_1; B_1|X_2Q)_\theta$, (5.72)
$R_2 \leq I(X_2; B_2|X_1Q)_\theta$, (5.73)
$R_1 + R_2 \leq I(X_1X_2; B_1B_2|Q)_\theta$. (5.74)

The entropic quantities are with respect to the state

$\theta^{QX_1X_2B_1B_2} \equiv \sum_{q,x_1,x_2} p_Q(q)\, p(x_1|q)\, p(x_2|q)\, |q\rangle\langle q|^{Q} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes |x_2\rangle\langle x_2|^{X_2} \otimes \rho^{B_1B_2}_{x_1,x_2}$. (5.75)

Then the region $\mathcal{R}_{\mathrm{Sato}}(\mathcal{N})$ is an outer bound on the capacity region of the quantum
interference channel.

The proof follows from the observation that any code for the quantum interference
channel also gives codes for three quantum multiple access channel subproblems: one
for Receiver 1, another for Receiver 2, and a third for the two receivers considered
together. We obtain the outer bound in Theorem 5.8 by using the outer bound on the
quantum multiple access channel rates from Theorem 4.1 for each of these channels.



5.8 Discussion 

In this chapter we saw how the coding techniques and theorems which we obtained
in Chapter 4 can be applied to prove coding theorems for the quantum interference
channel.

The key takeaway is that interference is not noise, and that it can be advantageous
to the receivers to decode messages in which they are not interested. For
Receiver 1, knowing the other user's transmissions allows him to increase the rate at
which he can decode, going from $I(X_1;B_1) = H(B_1) - H(B_1|X_1)$ to the improved rate
of $I(X_1;B_1|X_2) = H(B_1|X_2) - H(B_1|X_1X_2)$.
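
The gain from conditioning on the interfering input can be seen numerically in a purely classical toy case. The sketch below is only an illustration under an assumed channel that does not appear in this chapter: two independent uniform input bits and the output $B = X_1 \oplus X_2$. The helper name `entropy` is ours.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array (zero entries ignored)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Joint distribution p(x1, x2, b) for independent uniform bits x1, x2
# and the assumed toy channel output b = x1 XOR x2.
p = np.zeros((2, 2, 2))
for x1 in range(2):
    for x2 in range(2):
        p[x1, x2, x1 ^ x2] = 0.25

# I(X1;B) = H(X1) + H(B) - H(X1,B): interference treated as noise.
i_plain = entropy(p.sum(axis=(1, 2))) + entropy(p.sum(axis=(0, 1))) - entropy(p.sum(axis=1))

# I(X1;B|X2) = H(X1,X2) + H(X2,B) - H(X2) - H(X1,X2,B): interference known.
i_cond = (entropy(p.sum(axis=2)) + entropy(p.sum(axis=0))
          - entropy(p.sum(axis=(0, 2))) - entropy(p))

print(i_plain, i_cond)
```

In this extreme case the unconditioned rate is zero, while knowing $X_2$ makes one full bit per channel use available.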

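Because some of our results concerned special cases of the interference channel
problem, it is worthwhile to review our overall progress towards the characterization of
the capacity region of the general quantum interference channel $\mathcal{C}_{\mathrm{IC}}(\mathcal{N})$. For general
interference channels we have:

$$\mathcal{R}_{\mathrm{succ}}(\mathcal{N}) \subsetneq \mathcal{R}_{\mathrm{sim}}(\mathcal{N}) \subseteq \mathcal{R}^{o}_{\mathrm{HK}}(\mathcal{N}) = \mathcal{R}_{\mathrm{CMG}}(\mathcal{N}) \subseteq \mathcal{C}_{\mathrm{IC}}(\mathcal{N}) \subseteq \mathcal{R}_{\mathrm{Sato}}(\mathcal{N}).$$

In the special case of the interference channel with very strong interference, the
rate region achievable by successive decoding achieves the capacity: $\mathcal{R}_{\mathrm{succ}}(\mathcal{N}) = \mathcal{C}_{\mathrm{IC}}(\mathcal{N})$.
In the special case of strong interference, the rate region achievable by simultaneous
decoding is optimal: $\mathcal{R}_{\mathrm{sim}}(\mathcal{N}) = \mathcal{C}_{\mathrm{IC}}(\mathcal{N})$.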

An interesting research question would be to investigate whether splitting the 
messages into more than two parts, that is, turning the two-user IC into a multiple- 
input multiple-output (MIMO) IC, can improve on the rates that are achievable using 
the Han-Kobayashi strategy. 

In this chapter, we used the superposition coding technique to construct the codebooks
for the CMG coding strategy. We will use this technique again in the next
chapter in the context of the quantum broadcast channel.



Broadcast channels



How can a broadcast station communicate separate messages to two receivers using a 
single antenna? The two message streams must somehow be "mixed" during the encod- 
ing process so that the transmitted codewords will contain the information intended 
for both receivers. In this chapter we apply two codebook construction ideas from 
the chapter on interference channels to build codebooks for the quantum broadcast 
channel. 



The Chong-Motani-Garg construction used superposition encoding to encode a
"personal" message (satellite codeword) on top of a "common" message (cloud center).



In Section 6.2 we will use the superposition coding technique to encode a "personal"
message for one of the receivers on top of a "common" message for both receivers. Such
a choice of encoding is well suited for broadcast channels where one of the receivers'
signals is stronger than the other. We can pick the rate of the common message so as to
be decodable by the receiver with the weaker reception, and use the left-over capacity
to the better receiver to transmit a personal message for him. The superposition coding
technique was originally developed in this context [Cov72].



Another approach to constructing the mixing of the information streams is to use
two separate codebooks and an arbitrary mixing function that combines them as in the
Han-Kobayashi coding strategy. The Marton coding scheme presented in Section 6.3
uses this approach.

6.1 Introduction



The general broadcast communication scenario with two receivers involves the transmission
of up to three separate information streams. To illustrate the communication
problem, consider the situation described in Figure 6.1, where the television station
wants to transmit multiple streams of television programming to two separate receivers.




Figure 6.1: The broadcast channel. The sender wishes to transmit three separate information
streams: an English language TV station for Receiver 1, a French language TV station
for Receiver 2 and a weather TV station which is of interest to both receivers.



Suppose that in each block, the antenna has to transmit a common message $m \in
[1 : 2^{nR}]$ intended for both receivers and personal messages $m_1 \in [1 : 2^{nR_1}]$ and $m_2 \in
[1 : 2^{nR_2}]$, each intended for one of the receivers. The task is therefore described by the
following resource transformation:

$$n \cdot \mathcal{N}^{X \to B_1B_2} \;\xrightarrow{(1-\epsilon)}\; nR_1 \cdot [c \to c^1] + nR \cdot [c \to c^1c^2] + nR_2 \cdot [c \to c^2].$$

What are the achievable rate triples $(R_1, R, R_2)$ for this communication task?

Note that the everyday usage of the word broadcast presumes that only a common 
message is to be transmitted to all receivers. If only a common message is to be 
transmitted, that is, we are looking for rates of the form (0, R, 0), the broadcast channel 
problem reduces to the compound point-to-point channel problem and the capacity is 
given by the minimum of the rates achievable for the receivers. In order to make the 
problem interesting from the information theory perspective, we have to consider the 
case where at least one personal message is to be transmitted.

6.1.1 Previous work

A wide body of research exists in classical information theory on the study of broadcast
channels. An excellent review of this research is presented in [Cov98]. The broadcast
channel is also covered in textbooks [CT91, EGK11, EGK10]. In the classical case, two
of the best known strategies for transmitting information over broadcast channels are
superposition coding [Cov72, Ber73, KM77] and Marton over-binning using correlated
auxiliary random variables [Mar79]. Sections 6.2 and 6.3 of this chapter are dedicated
to the generalization of these coding strategies to classical-quantum broadcast channels.



6.1.2 Quantum broadcast channels 



Previous work on quantum broadcast channels includes [YHD11, GSE07, DHL10].
In [YHD11], the authors consider both classical and quantum communication over
quantum broadcast channels and prove a superposition coding inner bound similar to
our Theorem 6.1. There has also been research on quantum broadcast channels in two
other settings: quantum-quantum channels [DHL10] and bosonic broadcast channels
[GSE07]. The Marton rate region for the quantum-quantum broadcast channel was
developed in [DHL10]. The authors use decoupling techniques [ADHW09, AHS08,
Dup10] in order to show the Marton achievable rate region with no common message
for quantum communication.^1



We define a classical-quantum-quantum broadcast channel as the triple:

$$\left( \mathcal{X},\; \mathcal{N}^{X \to B_1B_2}(x) \equiv \rho_x^{B_1B_2},\; \mathcal{H}^{B_1B_2} \right), \tag{6.1}$$

where $x$ is a classical letter in an alphabet $\mathcal{X}$ and $\rho_x^{B_1B_2}$ is
a density operator on the tensor product Hilbert space for
systems $B_1$ and $B_2$. The model is such that when the sender
inputs a classical letter $x$, Receiver 1 obtains system $B_1$, and
Receiver 2 obtains system $B_2$. Since Receiver 1 does not have access to the $B_2$ part
of the state $\rho_x^{B_1B_2}$, we model his state as $\rho_x^{B_1} = \mathrm{Tr}_{B_2}\!\left[\rho_x^{B_1B_2}\right]$, where $\mathrm{Tr}_{B_2}$ denotes the
partial trace over Receiver 2's system.

Figure 6.2: A quantum broadcast channel $\rho_x^{B_1B_2}$, with outputs at the two receivers Rx1 and Rx2.

^1 Note that the well-known no-cloning theorem of quantum information precludes the possibility of
a quantum common message: $[q \to q^1q^2]$, where the quantum information of some system controlled
by the sender is faithfully transferred to two receivers. See [YHD11] for more comments on this issue.
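
As a small numerical illustration of the reduced state $\rho_x^{B_1} = \mathrm{Tr}_{B_2}[\rho_x^{B_1B_2}]$, the following sketch computes a partial trace with NumPy. The function name `partial_trace_b2` and the example output state (a maximally entangled pair of qubits) are assumptions made for illustration only.

```python
import numpy as np

def partial_trace_b2(rho, d1, d2):
    """Trace out the second subsystem of a (d1*d2) x (d1*d2) density matrix."""
    rho = np.asarray(rho).reshape(d1, d2, d1, d2)
    # Sum over the matching B2 indices (second and fourth axes).
    return np.einsum('ijkj->ik', rho)

# Assumed toy output state on B1 B2: a maximally entangled pair of qubits.
phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_b1b2 = np.outer(phi, phi.conj())

# Receiver 1's local state is the maximally mixed qubit state.
rho_b1 = partial_trace_b2(rho_b1b2, 2, 2)
print(rho_b1)
```

For a product state $\rho^{B_1} \otimes \rho^{B_2}$ the same function simply returns $\rho^{B_1}$.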



6.1.3 Information processing task 

The task of communication over a broadcast channel is to use $n$ independent instances
of the channel in order to communicate classical information to Receiver 1 at a rate $R_1$,
to Receiver 2 at a rate $R_2$, and to both receivers at a rate $R$. More specifically, the
sender chooses a triple of messages $(m_1, m, m_2) \in [1 : 2^{nR_1}] \times [1 : 2^{nR}] \times [1 : 2^{nR_2}]$,
and encodes these messages into an $n$-symbol codeword $x^n(m_1, m, m_2) \in \mathcal{X}^n$ suitable
as input for the $n$ channel uses.

The output of the channel is a quantum state of the form:

$$\mathcal{N}^{\otimes n}\big(x^n(m_1,m,m_2)\big) \equiv \rho_{x^n(m_1,m,m_2)}^{B_1^nB_2^n} \in \mathcal{D}\big(\mathcal{H}^{B_1^nB_2^n}\big), \tag{6.2}$$

where $\rho_{x^n}^{B_1^nB_2^n} = \rho_{x_1}^{B_1B_2} \otimes \cdots \otimes \rho_{x_n}^{B_1B_2}$. To decode the common message $m$ and the
message $m_1$ intended specifically for him, Receiver 1 performs a POVM $\{\Lambda_{m_1,m}\}$, $m_1 \in
[1, \ldots, |\mathcal{M}_1|]$, $m \in [1, \ldots, |\mathcal{M}|]$, on the system $B_1^n$, the output of which we denote
$(M_1', M')$. Receiver 2 similarly performs a POVM $\{\Gamma_{m,m_2}\}$, $m_2 \in [1, \ldots, |\mathcal{M}_2|]$, $m \in
[1, \ldots, |\mathcal{M}|]$, on the system $B_2^n$, and his outcome is denoted $(M'', M_2'')$.

An error occurs whenever either of the receivers decodes one of the messages
incorrectly. The probability of error for a particular message triple $(m_1, m, m_2)$ is

$$p_e(m_1, m, m_2) \equiv \mathrm{Tr}\left\{ \left( I - \Lambda_{m_1,m} \otimes \Gamma_{m,m_2} \right) \rho_{x^n(m_1,m,m_2)}^{B_1^nB_2^n} \right\},$$

where the measurement operator $(I - \Lambda_{m_1,m} \otimes \Gamma_{m,m_2})$ represents the complement of
the correct decoding outcome.

Definition 6.1. An $(n, R_1, R, R_2, \epsilon)$ classical-quantum broadcast channel code consists
of a codebook $\{x^n(m_1, m, m_2)\}$, $m_1 \in \mathcal{M}_1$, $m \in \mathcal{M}$, $m_2 \in \mathcal{M}_2$, and two decoding
POVMs $\{\Lambda_{m_1,m}\}_{m_1 \in \mathcal{M}_1, m \in \mathcal{M}}$ and $\{\Gamma_{m,m_2}\}_{m \in \mathcal{M}, m_2 \in \mathcal{M}_2}$ such that the average probability
of error $\bar{p}_e$ is bounded from above as

$$\bar{p}_e \equiv \frac{1}{|\mathcal{M}_1||\mathcal{M}||\mathcal{M}_2|} \sum_{m_1,m,m_2} p_e(m_1, m, m_2) \leq \epsilon. \tag{6.3}$$

We say that a rate triple $(R_1, R, R_2)$ is achievable if there exists an $(n, R_1 - \delta, R - \delta, R_2 - \delta, \epsilon)$
quantum broadcast channel code for all $\epsilon, \delta > 0$ and sufficiently large $n$.

A broadcast channel code with no common message is a special case of the above
communication task where the rate of the common message is set to zero: $(n, R_1, 0, R_2, \epsilon)$.
Alternately, we could choose not to send a personal message for Receiver 2 and obtain
codes of the form $(n, R_1, R, 0, \epsilon)$, which is known as the broadcast channel with a
degraded message set [KM77].



6.1.4 Chapter overview 

In this chapter, we derive two achievable rate regions for classical-quantum broadcast
channels by exploiting the error analysis techniques developed in the context
of quantum multiple access channels (Chapter 4) and quantum interference channels
(Chapter 5).



In Section 6.2, we establish the achievability of the rates in the superposition
coding rate region (Theorem 6.1). We use a quantum simultaneous decoder at one
of the receivers. Yard et al. independently proved the quantum superposition coding
inner bound [YHD11], but our proof is arguably simpler and more in the spirit of its
classical analogue [EGK10].



In Section 6.3 we prove that the quantum Marton rate region with no common
message is achievable (Theorem 6.2). The Marton coding scheme is based on the
idea of over-binning and using correlated auxiliary random variables [Mar79]. The
sub-channels to each receiver are essentially point-to-point, but it turns out that the
projector trick technique is still needed in our proof. The Marton coding
scheme gives the best known achievable rate region for the classical-quantum broadcast
channel.



6.2 Superposition coding inner bound 

One possible strategy for the broadcast channel is to send a message at a rate that is
low enough that both receivers are able to decode. Furthermore, if we assume that
Receiver 1 has a better reception signal, then the sender can encode a further message
superimposed on top of the common message that Receiver 1 will be able to decode
given the common message. The sender encodes the common message at rate $R$ using a
codebook generated from a probability distribution $p_W(w)$ and the additional message
for Receiver 1 at rate $R_1$ using a conditional codebook with distribution $p_{X|W}(x|w)$.
This is known as the superposition coding strategy [Cov72, Ber73].

Theorem 6.1 (Superposition coding inner bound). Let $W$ be an auxiliary random
variable, let $p = p_{X|W}(x|w)\,p_W(w)$ be an arbitrary code distribution and let
$(\mathcal{X}, \rho_x^{B_1B_2})$ be a classical-quantum broadcast channel. The superposition coding
rate region $\mathcal{R}_{\mathrm{SC}}(\mathcal{N}, p)$, which consists of all rate pairs $(R_1, R)$ such that

$$R_1 \leq I(X;B_1|W)_\theta, \tag{6.4}$$
$$R \leq I(W;B_2)_\theta, \tag{6.5}$$
$$R_1 + R \leq I(X;B_1)_\theta, \tag{6.6}$$

is achievable for the quantum broadcast channel. The information quantities are with
respect to a state $\theta^{WXB_1B_2}$ of the form:

$$\theta^{WXB_1B_2} = \sum_{w,x} p_W(w)\,p_{X|W}(x|w)\; |w\rangle\langle w|^{W} \otimes |x\rangle\langle x|^{X} \otimes \rho_x^{B_1B_2}. \tag{6.7}$$

The superposition coding strategy allows us to construct codes for the broadcast
channel of the form $(n, R_1, R, 0, \epsilon)$, which have no personal message for Receiver 2. The
task is therefore described as follows:

$$n \cdot \mathcal{N}^{X \to B_1B_2} \;\xrightarrow{(1-\epsilon)}\; nR_1 \cdot [c \to c^1] + nR \cdot [c \to c^1c^2], \tag{6.8}$$

where $[c \to c^1c^2]$ denotes the noiseless transmission of one bit to both receivers.

Proof. The new idea in the proof is to exploit superposition coding and a quantum
simultaneous decoder for the decoding of the first receiver [Cov72, Ber73] instead of
the quantum successive decoding used in [YHD11]. We use a standard HSW decoder
for the second receiver [Hol98, SW97].

Codebook generation. We randomly and independently generate $2^{nR}$ sequences
$w^n(m)$ according to the product distribution $\prod_{i=1}^{n} p_W(w_i)$. For each sequence $w^n(m)$,
we then randomly and conditionally independently generate $2^{nR_1}$ sequences $x^n(m_1, m)$
according to the product distribution $\prod_{i=1}^{n} p_{X|W}(x_i|w_i(m))$.
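
The codebook generation step can be sketched for small alphabets as follows. This is only an illustrative sketch: the function name `superposition_codebook`, the binary alphabets and the particular distributions are assumptions, and realistic block lengths would make the codebooks exponentially large.

```python
import numpy as np

def superposition_codebook(p_w, p_x_given_w, n, num_common, num_personal, rng):
    """Draw cloud centers w^n(m) i.i.d. from p_W, then satellite codewords
    x^n(m1, m) conditionally independently from p_{X|W}(. | w_i(m))."""
    p_w = np.asarray(p_w, dtype=float)
    p_x_given_w = np.asarray(p_x_given_w, dtype=float)  # shape (|W|, |X|)
    w = rng.choice(len(p_w), size=(num_common, n), p=p_w)
    x = np.empty((num_personal, num_common, n), dtype=int)
    for m in range(num_common):
        for i in range(n):
            # All satellites of cloud m share the same conditioning symbol w_i(m).
            x[:, m, i] = rng.choice(p_x_given_w.shape[1], size=num_personal,
                                    p=p_x_given_w[w[m, i]])
    return w, x

rng = np.random.default_rng(0)
w, x = superposition_codebook([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]],
                              n=8, num_common=4, num_personal=2, rng=rng)
print(w.shape, x.shape)  # one w^n per common message, one x^n per message pair
```

Here `x[m1, m]` plays the role of $x^n(m_1, m)$ and `w[m]` the role of $w^n(m)$.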
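POVM Construction for Receiver 1. We now describe the POVM that Receiver 1
employs in order to decode the transmitted messages. First consider the state we
obtain from (6.7) by tracing over the $B_2$ system:

$$\rho^{WXB_1} = \sum_{w,x} p_W(w)\,p_{X|W}(x|w)\; |w\rangle\langle w|^{W} \otimes |x\rangle\langle x|^{X} \otimes \rho_x^{B_1}.$$

Consider the following two averaged states:

$$\sigma_{w^n}^{B_1^n} \equiv \sum_{x^n} p_{X^n|W^n}(x^n|w^n)\,\rho_{x^n}^{B_1^n} = \bigotimes_{i=1}^{n} \Big( \sum_{x} p_{X|W}(x|w_i)\,\rho_x^{B_1} \Big) = \mathbb{E}_{X^n|w^n}\big\{ \rho_{X^n}^{B_1^n} \big\},$$

$$\bar{\rho}^{\otimes n} \equiv \sum_{w^n,x^n} p_{W^n}(w^n)\,p_{X^n|W^n}(x^n|w^n)\,\rho_{x^n}^{B_1^n} = \bigotimes_{i=1}^{n} \Big( \sum_{w,x} p(w)\,p(x|w)\,\rho_x^{B_1} \Big) = \mathbb{E}_{X^n,W^n}\big\{ \rho_{X^n}^{B_1^n} \big\}.$$

We now introduce the shorthand notation $\Pi_{X^n(m_1,m)}$, $\Pi_{W^n(m)}$ and $\Pi$ to denote the
(conditionally) typical projectors with respect to the output state $\rho_{X^n(m_1,m)}^{B_1^n}$ and the
two averaged states $\sigma_{W^n(m)}^{B_1^n}$ and $\bar{\rho}^{\otimes n}$ defined above.

Receiver 1 will decode using a POVM $\{\Lambda_{m_1,m}\}$ defined as the square root measurement:

$$\Lambda_{m_1,m} \equiv \Big( \sum_{k,l} P_{k,l} \Big)^{-1/2} P_{m_1,m} \Big( \sum_{k,l} P_{k,l} \Big)^{-1/2}, \tag{6.9}$$

based on the following positive operators:

$$P_{m_1,m} \equiv \Pi\; \Pi_{W^n(m)}\; \Pi_{X^n(m_1,m)}\; \Pi_{W^n(m)}\; \Pi. \tag{6.10}$$

Note the projector sandwich structure with the more specific projectors on the
inside. We have seen this previously in the construction of the simultaneous decoder
POVM for the quantum multiple access channel.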
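
The square root measurement itself is easy to compute numerically for small examples. The sketch below normalizes an arbitrary list of positive operators into a POVM; the helper names `inv_sqrt_psd` and `square_root_measurement` are ours, and the two rank-one operators used as input are an assumed toy example rather than the typical-projector sandwiches of the proof.

```python
import numpy as np

def inv_sqrt_psd(a, tol=1e-12):
    """Inverse square root of a positive semidefinite matrix on its support."""
    vals, vecs = np.linalg.eigh(a)
    inv = np.array([1.0 / np.sqrt(v) if v > tol else 0.0 for v in vals])
    return (vecs * inv) @ vecs.conj().T

def square_root_measurement(ops):
    """Return Lambda_k = S^{-1/2} P_k S^{-1/2}, where S is the sum of all P_k."""
    s = inv_sqrt_psd(sum(ops))
    return [s @ p @ s for p in ops]

# Assumed toy example: two non-orthogonal rank-one positive operators on a qubit.
ket0 = np.array([1.0, 0.0])
ketp = np.array([1.0, 1.0]) / np.sqrt(2)
ops = [np.outer(ket0, ket0), np.outer(ketp, ketp)]
povm = square_root_measurement(ops)
print(np.round(sum(povm), 6))  # sums to the identity when S has full rank
```

When the sum of the operators does not have full rank, the elements sum to the projector onto its support, and the measurement is completed with the complementary "failure" element.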



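POVM Construction for Receiver 2. Consider now the state in equation (6.7)
from the point of view of Receiver 2. If we trace over the $X$ and $B_1$ systems, we obtain
the following state:

$$\rho^{WB_2} = \sum_{w} p_W(w)\; |w\rangle\langle w|^{W} \otimes \sigma_w^{B_2},$$

where $\sigma_w^{B_2} = \sum_x p_{X|W}(x|w)\,\rho_x^{B_2}$. Define also the state

$$\bar{\rho} \equiv \sum_{w,x} p_W(w)\,p_{X|W}(x|w)\,\rho_x^{B_2}. \tag{6.11}$$

The second receiver uses a standard square root measurement:

$$\Gamma_{m} \equiv \Big( \sum_{k} P'_{k} \Big)^{-1/2} P'_{m} \Big( \sum_{k} P'_{k} \Big)^{-1/2}, \tag{6.12}$$

based on the following positive operators:

$$P'_{m} \equiv \Pi\; \Pi_{W^n(m)}\; \Pi, \tag{6.13}$$

where the above projectors are typical projectors defined with respect to the states
$\sigma_{W^n(m)}^{B_2^n}$ and $\bar{\rho}^{\otimes n}$.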

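Error analysis for Receiver 1. We now analyze the expectation of the average error
probability for the first receiver with the POVM defined in (6.9):

$$\mathbb{E}_{X^n,W^n}\Big\{ \frac{1}{|\mathcal{M}_1||\mathcal{M}|} \sum_{m_1,m} \mathrm{Tr}\big\{ (I - \Lambda_{m_1,m})\,\rho_{X^n(m_1,m)}^{B_1^n} \big\} \Big\}
= \frac{1}{|\mathcal{M}_1||\mathcal{M}|} \sum_{m_1,m} \mathbb{E}_{X^n,W^n}\Big\{ \mathrm{Tr}\big\{ (I - \Lambda_{m_1,m})\,\rho_{X^n(m_1,m)}^{B_1^n} \big\} \Big\}.$$

Due to the above exchange between the expectation and the average and the symmetry
of the code construction (each codeword is selected randomly and independently), it
suffices to analyze the expectation of the average error probability for the first message
pair $(m_1 = 1, m = 1)$, i.e., the last line above is equal to $\mathbb{E}_{X^n,W^n}\big\{ \mathrm{Tr}\big\{ (I - \Lambda_{1,1})\,\rho_{X^n(1,1)}^{B_1^n} \big\} \big\}$.

Using the Hayashi-Nagaoka operator inequality (Lemma 3.1 on page 34), we obtain
the following upper bound on this term:

$$\begin{aligned}
\mathbb{E}\big\{ \mathrm{Tr}\big\{ (I - \Lambda_{1,1})\,\rho_{X^n(1,1)} \big\} \big\} \leq{}& 2\,\mathbb{E}\big\{ \mathrm{Tr}\big\{ (I - P_{1,1})\,\rho_{X^n(1,1)} \big\} \big\} \\
&+ 4 \sum_{(m_1,m) \neq (1,1)} \mathbb{E}\big\{ \mathrm{Tr}\big\{ P_{m_1,m}\,\rho_{X^n(1,1)} \big\} \big\}.
\end{aligned} \tag{6.14}$$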



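We begin by bounding the term in the first line above. Consider the following
chain of inequalities:

$$\begin{aligned}
\mathbb{E}_{X^n,W^n}\big\{ \mathrm{Tr}\big\{ P_{1,1}\,\rho_{X^n(1,1)} \big\} \big\}
&= \mathbb{E}\big\{ \mathrm{Tr}\big\{ \Pi\,\Pi_{W^n(1)}\,\Pi_{X^n(1,1)}\,\Pi_{W^n(1)}\,\Pi\;\rho_{X^n(1,1)} \big\} \big\} \\
&\geq \mathbb{E}\big\{ \mathrm{Tr}\big\{ \Pi_{X^n(1,1)}\,\rho_{X^n(1,1)} \big\} \big\}
 - \mathbb{E}\big\{ \big\| \rho_{X^n(1,1)} - \Pi\,\rho_{X^n(1,1)}\,\Pi \big\|_1 \big\} \\
&\qquad - \mathbb{E}\big\{ \big\| \rho_{X^n(1,1)} - \Pi_{W^n(1)}\,\rho_{X^n(1,1)}\,\Pi_{W^n(1)} \big\|_1 \big\} \\
&\geq 1 - \epsilon - 4\sqrt{\epsilon},
\end{aligned}$$

where the first inequality follows from the inequality

$$\mathrm{Tr}\{\Lambda\rho\} \leq \mathrm{Tr}\{\Lambda\sigma\} + \|\rho - \sigma\|_1, \tag{6.15}$$

which holds for all $\rho$, $\sigma$, and $\Lambda$ such that $0 \leq \rho, \sigma, \Lambda \leq I$. The second inequality follows
from the gentle operator lemma for ensembles (see Lemma 3.2) and the properties of
typical projectors for sufficiently large $n$.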



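We now focus on bounding the second term of (6.14). We can expand this term
as follows:

$$\sum_{(m_1,m) \neq (1,1)} \mathrm{Tr}\big\{ P_{m_1,m}\,\rho_{X^n(1,1)} \big\}
= \underbrace{\sum_{m_1 \neq 1} \mathrm{Tr}\big\{ P_{m_1,1}\,\rho_{X^n(1,1)} \big\}}_{(E1)}
+ \underbrace{\sum_{m_1,\; m \neq 1} \mathrm{Tr}\big\{ P_{m_1,m}\,\rho_{X^n(1,1)} \big\}}_{(E2)}.$$

We will now compute the expectation of the first term, (E1), with respect to
the code randomness:

$$\begin{aligned}
\mathbb{E}\{(E1)\}
&= \sum_{m_1 \neq 1} \mathbb{E}\big\{ \mathrm{Tr}\big\{ P_{m_1,1}\,\rho_{X^n(1,1)} \big\} \big\} \\
&= \sum_{m_1 \neq 1} \mathbb{E}\big\{ \mathrm{Tr}\big\{ \Pi\,\Pi_{W^n(1)}\,\Pi_{X^n(m_1,1)}\,\Pi_{W^n(1)}\,\Pi\;\rho_{X^n(1,1)} \big\} \big\} \\
&\leq 2^{n[H(B_1|WX)+\delta]} \sum_{m_1 \neq 1} \mathbb{E}\big\{ \mathrm{Tr}\big\{ \Pi\,\Pi_{W^n(1)}\,\rho_{X^n(m_1,1)}\,\Pi_{W^n(1)}\,\Pi\;\rho_{X^n(1,1)} \big\} \big\} \\
&= 2^{n[H(B_1|WX)+\delta]} \sum_{m_1 \neq 1} \mathbb{E}_{W^n}\big\{ \mathrm{Tr}\big\{ \Pi\,\Pi_{W^n(1)}\,\sigma_{W^n(1)}\,\Pi_{W^n(1)}\,\Pi\;\sigma_{W^n(1)} \big\} \big\} \\
&\leq 2^{n[H(B_1|WX)+\delta]}\,2^{-n[H(B_1|W)-\delta]} \sum_{m_1 \neq 1} \mathbb{E}_{W^n}\big\{ \mathrm{Tr}\big\{ \Pi\,\Pi_{W^n(1)}\,\Pi\;\sigma_{W^n(1)} \big\} \big\} \\
&\leq 2^{n[H(B_1|WX)+\delta]}\,2^{-n[H(B_1|W)-\delta]} \sum_{m_1 \neq 1} \mathbb{E}_{W^n}\big\{ \mathrm{Tr}\big\{ \sigma_{W^n(1)} \big\} \big\} \\
&\leq 2^{-n[I(X;B_1|W)-2\delta]}\,|\mathcal{M}_1|.
\end{aligned}$$

The first inequality is due to the projector trick inequality which states that:

$$\Pi_{X^n(m_1,1)} \leq 2^{n[H(B_1|WX)+\delta]}\,\rho_{X^n(m_1,1)}. \tag{6.16}$$

The equality that follows it holds because, given $W^n(1)$, the codewords $X^n(m_1,1)$ and
$X^n(1,1)$ are conditionally independent for $m_1 \neq 1$, so each of the two states can be
averaged separately. The second inequality follows from the properties of typical projectors:

$$\Pi_{W^n(1)}\,\sigma_{W^n(1)}\,\Pi_{W^n(1)} \leq 2^{-n[H(B_1|W)-\delta]}\,\Pi_{W^n(1)}. \tag{6.17}$$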



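We now consider the expectation of the second term (E2) with respect to the
random choice of codebook:

$$\begin{aligned}
\mathbb{E}\{(E2)\}
&= \sum_{m_1,\; m \neq 1} \mathbb{E}\big\{ \mathrm{Tr}\big\{ P_{m_1,m}\,\rho_{X^n(1,1)} \big\} \big\} \\
&= \sum_{m_1,\; m \neq 1} \mathbb{E}\big\{ \mathrm{Tr}\big\{ \Pi\,\Pi_{W^n(m)}\,\Pi_{X^n(m_1,m)}\,\Pi_{W^n(m)}\,\Pi\;\rho_{X^n(1,1)} \big\} \big\} \\
&= \sum_{m_1,\; m \neq 1} \mathrm{Tr}\big\{ \mathbb{E}\big\{ \Pi_{W^n(m)}\,\Pi_{X^n(m_1,m)}\,\Pi_{W^n(m)} \big\}\; \Pi\;\mathbb{E}\big\{ \rho_{X^n(1,1)} \big\}\,\Pi \big\} \\
&= \sum_{m_1,\; m \neq 1} \mathrm{Tr}\big\{ \mathbb{E}\big\{ \Pi_{W^n(m)}\,\Pi_{X^n(m_1,m)}\,\Pi_{W^n(m)} \big\}\; \Pi\,\bar{\rho}^{\otimes n}\,\Pi \big\} \\
&\leq 2^{-n[H(B_1)-\delta]} \sum_{m_1,\; m \neq 1} \mathbb{E}\big\{ \mathrm{Tr}\big\{ \Pi_{W^n(m)}\,\Pi_{X^n(m_1,m)}\,\Pi_{W^n(m)}\,\Pi \big\} \big\} \\
&= 2^{-n[H(B_1)-\delta]} \sum_{m_1,\; m \neq 1} \mathbb{E}\big\{ \mathrm{Tr}\big\{ \Pi_{X^n(m_1,m)}\,\Pi_{W^n(m)}\,\Pi\,\Pi_{W^n(m)} \big\} \big\} \\
&\leq 2^{-n[H(B_1)-\delta]} \sum_{m_1,\; m \neq 1} \mathbb{E}\big\{ \mathrm{Tr}\big\{ \Pi_{X^n(m_1,m)} \big\} \big\} \\
&\leq 2^{-n[H(B_1)-\delta]}\,2^{n[H(B_1|WX)+\delta]}\,|\mathcal{M}_1||\mathcal{M}| \\
&= 2^{-n[I(WX;B_1)-2\delta]}\,|\mathcal{M}_1||\mathcal{M}| \\
&= 2^{-n[I(X;B_1)-2\delta]}\,|\mathcal{M}_1||\mathcal{M}|.
\end{aligned}$$

The equality $I(WX;B_1) = I(X;B_1)$ follows from the way the codebook is constructed
(the quantum Markov chain $W - X - B_1$). This completes the error analysis for the
first receiver.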

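Error analysis for Receiver 2. The proof for the second receiver is analogous to
the point-to-point HSW theorem. The following bound holds for the expectation of
the average error probability for the second receiver if $n$ is sufficiently large:

$$\begin{aligned}
\mathbb{E}_{X^n,W^n}\Big\{ \frac{1}{|\mathcal{M}|} \sum_{m} \mathrm{Tr}\big\{ (I - \Gamma_m)\,\rho_{X^n(m_1,m)}^{B_2^n} \big\} \Big\}
&= \mathbb{E}_{W^n}\Big\{ \frac{1}{|\mathcal{M}|} \sum_{m} \mathrm{Tr}\big\{ (I - \Gamma_m)\,\sigma_{W^n(m)}^{B_2^n} \big\} \Big\} \\
&\leq 2\,(\epsilon + 2\sqrt{\epsilon}) + 4 \cdot 2^{-n[I(W;B_2)-2\delta]}\,|\mathcal{M}|.
\end{aligned}$$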



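Putting everything together, the joint POVM performed by both receivers is of
the form $\Lambda_{m_1,m} \otimes \Gamma_{m}$, and the expectation of the average error probability for both
receivers is bounded from above as

$$\begin{aligned}
&\mathbb{E}\Big\{ \frac{1}{|\mathcal{M}_1||\mathcal{M}|} \sum_{m_1,m} \mathrm{Tr}\big\{ (I - \Lambda_{m_1,m} \otimes \Gamma_{m})\,\rho_{X^n(m_1,m)}^{B_1^nB_2^n} \big\} \Big\} \\
&\quad\leq \mathbb{E}\Big\{ \frac{1}{|\mathcal{M}_1||\mathcal{M}|} \sum_{m_1,m} \mathrm{Tr}\big\{ (I - \Lambda_{m_1,m})\,\rho_{X^n(m_1,m)}^{B_1^n} \big\} \Big\}
 + \mathbb{E}\Big\{ \frac{1}{|\mathcal{M}_1||\mathcal{M}|} \sum_{m_1,m} \mathrm{Tr}\big\{ (I - \Gamma_{m})\,\rho_{X^n(m_1,m)}^{B_2^n} \big\} \Big\} \\
&\quad\leq 4\epsilon + 12\sqrt{\epsilon} + 4 \cdot 2^{-n[I(W;B_2)-2\delta]}\,|\mathcal{M}|
 + 4\Big( 2^{-n[I(X;B_1|W)-2\delta]}\,|\mathcal{M}_1| + 2^{-n[I(X;B_1)-2\delta]}\,|\mathcal{M}_1||\mathcal{M}| \Big),
\end{aligned}$$

where the first inequality uses the operator union bound from Lemma 5.1:

$$I - \Lambda_{m_1,m} \otimes \Gamma_{m} \leq (I - \Lambda_{m_1,m}) \otimes I + I \otimes (I - \Gamma_{m}).$$

Thus, as long as the sender chooses the message set sizes $|\mathcal{M}_1|$ and $|\mathcal{M}|$ such that $|\mathcal{M}_1| \leq
2^{n[I(X;B_1|W)-3\delta]}$, $|\mathcal{M}| \leq 2^{n[I(W;B_2)-3\delta]}$, and $|\mathcal{M}_1||\mathcal{M}| \leq 2^{n[I(X;B_1)-3\delta]}$, then there exists
a particular code with asymptotically vanishing average error probability in the large
$n$ limit. □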



Taking the union over all possible choices of input distribution $p_{WX}(w,x)$ gives us
the superposition coding inner bound: $\mathcal{R}_{\mathrm{SC}}(\mathcal{N}) = \bigcup_{p_{WX}} \mathcal{R}_{\mathrm{SC}}(\mathcal{N}, p_{WX})$.



6.3 Marton coding scheme 



We now prove that the Marton inner bound is achievable for quantum broadcast channels.
The Marton scheme depends on auxiliary random variables $U_1$ and $U_2$, binning,
and the properties of strongly^2 typical sequences and projectors.

^2 The notion of strong typicality or frequency typicality differs from the entropy typicality we have
used until now. See [Wil11, Section 14.2.3].

Theorem 6.2 (Marton inner bound). Let $\{\rho_x^{B_1B_2}\}$ be a classical-quantum broadcast
channel and let $x = f(u_1, u_2)$ be a deterministic function. The following rate region is
achievable:

$$\begin{aligned}
R_1 &\leq I(U_1;B_1)_\theta, \\
R_2 &\leq I(U_2;B_2)_\theta, \\
R_1 + R_2 &\leq I(U_1;B_1)_\theta + I(U_2;B_2)_\theta - I(U_1;U_2)_\theta,
\end{aligned} \tag{6.18}$$

where the information quantities are with respect to the state:

$$\theta^{U_1U_2B_1B_2} = \sum_{u_1,u_2} p(u_1,u_2)\; |u_1\rangle\langle u_1|^{U_1} \otimes |u_2\rangle\langle u_2|^{U_2} \otimes \rho_{f(u_1,u_2)}^{B_1B_2}.$$

The coding scheme in Theorem 6.2 is a broadcast channel code with no common
message: $(n, R_1, 0, R_2, \epsilon)$. The information processing task is described by:

$$n \cdot \mathcal{N}^{X \to B_1B_2} \;\xrightarrow{(1-\epsilon)}\; nR_1 \cdot [c \to c^1] + nR_2 \cdot [c \to c^2]. \tag{6.19}$$

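Proof. Consider the classical-quantum broadcast channel $\{\mathcal{N}(x) \equiv \rho_x^{B_1B_2}\}$, and a
deterministic mixing function $f : \mathcal{U}_1 \times \mathcal{U}_2 \to \mathcal{X}$. Using the mixing function as a pre-coder
to the broadcast channel $\mathcal{N}$, we obtain a channel defined as:

$$\mathcal{N}'(u_1, u_2) \equiv \rho_{f(u_1,u_2)}^{B_1B_2} \equiv \rho_{u_1,u_2}^{B_1B_2}. \tag{6.20}$$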



Codebook construction. Define two auxiliary indices $\ell_1 \in [1 : L_1]$, $L_1 = 2^{n[I(U_1;B_1)-\delta]}$,
and $\ell_2 \in [1 : L_2]$, $L_2 = 2^{n[I(U_2;B_2)-\delta]}$. For each $\ell_1$ generate an i.i.d. random sequence
$u_1^n(\ell_1)$ according to $p_{U_1^n}(u_1^n)$. Similarly we choose $L_2$ random i.i.d. sequences $u_2^n(\ell_2)$
according to $p_{U_2^n}(u_2^n)$. Partition the sequences $u_1^n(\ell_1)$ into $2^{nR_1}$ different bins $B_{m_1}$.
Similarly, partition the sequences $u_2^n(\ell_2)$ into $2^{nR_2}$ bins $C_{m_2}$. For each message pair
$(m_1, m_2)$, the sender selects a pair of sequences $(u_1^n(\ell_1), u_2^n(\ell_2)) \in (B_{m_1} \times C_{m_2}) \cap \mathcal{A}^n_{p_{U_1U_2},\delta}$, that is,
each sequence is taken from the appropriate bin and the pair is required to be
strongly jointly typical. If no such pair exists, the sender declares failure. The codebook $x^n(m_1, m_2)$
is deterministically constructed from $(u_1^n(\ell_1), u_2^n(\ell_2))$ by applying the function $x_i =
f(u_{1i}, u_{2i})$.
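
The encoder's search through a pair of bins can be sketched as follows. This is a toy illustration under stated assumptions: binary alphabets, an assumed joint distribution `p_joint`, a simple frequency test standing in for strong joint typicality, equal-size bins, and XOR as the mixing function $f$. The names `jointly_typical` and `marton_encode` are ours.

```python
import numpy as np

def jointly_typical(u1, u2, p_joint, delta):
    """Frequency test: the empirical joint type is within delta of p_joint
    and puts no weight on pairs that have probability zero."""
    counts = np.zeros_like(p_joint, dtype=float)
    np.add.at(counts, (u1, u2), 1.0)
    freq = counts / len(u1)
    return bool(np.all(np.abs(freq - p_joint) <= delta)
                and np.all(freq[p_joint == 0] == 0))

def marton_encode(m1, m2, U1, U2, bins1, bins2, p_joint, delta, f):
    """Return x^n for the message pair (m1, m2), or None on encoding failure (E0)."""
    for l1 in bins1[m1]:
        for l2 in bins2[m2]:
            if jointly_typical(U1[l1], U2[l2], p_joint, delta):
                return f(U1[l1], U2[l2])   # apply x_i = f(u_1i, u_2i) symbol-wise
    return None

rng = np.random.default_rng(1)
p_joint = np.array([[0.4, 0.1], [0.1, 0.4]])      # assumed p(u1, u2)
n, L1, L2, num_bins = 20, 64, 64, 4
U1 = rng.choice(2, size=(L1, n), p=p_joint.sum(axis=1))  # drawn from the marginals
U2 = rng.choice(2, size=(L2, n), p=p_joint.sum(axis=0))
bins1 = np.array_split(np.arange(L1), num_bins)
bins2 = np.array_split(np.arange(L2), num_bins)
x = marton_encode(0, 0, U1, U2, bins1, bins2, p_joint, 0.1, lambda a, b: a ^ b)
```

For such a short block length the search may fail for some message pairs; the counting argument in the error analysis explains why failures become rare as $n$ grows when the rates are chosen appropriately.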

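Transmission. Let $(\ell_1, \ell_2)$ denote the pair of indices of the joint sequence $(u_1^n(\ell_1), u_2^n(\ell_2))$
which was chosen as the codeword for message $(m_1, m_2)$. Expressed in terms of these
indices the output of the channel is

$$\rho_{u_1^n(\ell_1),u_2^n(\ell_2)}^{B_1^nB_2^n} = \bigotimes_{i \in [n]} \rho_{f(u_{1i}(\ell_1),\,u_{2i}(\ell_2))}^{B_1B_2} \equiv \rho_{\ell_1,\ell_2}. \tag{6.21}$$

Define the following average states for Receiver 1:

$$\omega_{u_1}^{B_1} \equiv \sum_{u_2} p_{U_2|U_1}(u_2|u_1)\,\rho_{u_1,u_2}^{B_1}, \qquad \bar{\rho} \equiv \sum_{u_1} p(u_1)\,\omega_{u_1}^{B_1}. \tag{6.22}$$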

Decoding. The detection POVM for Receiver 1, $\{\Lambda_{\ell_1}\}_{\ell_1 \in [1,\ldots,L_1]}$, is constructed by
using the square-root measurement as in (3.12) based on the following combination of
strongly typical projectors:

$$\Pi'_{\ell_1} \equiv \Pi^n_{\bar{\rho},\delta}\; \Pi_{u_1^n(\ell_1)}\; \Pi^n_{\bar{\rho},\delta}. \tag{6.23}$$

The outcome of the measurement will be denoted $L_1'$. The projectors $\Pi_{u_1^n(\ell_1)}$ and $\Pi^n_{\bar{\rho},\delta}$
are defined with respect to the states $\omega_{u_1^n(\ell_1)}$ and $\bar{\rho}^{\otimes n}$ given in (6.22). Note that we use
strongly typical projectors in this case as defined in [Wil11, Section 14.2.3]. Knowing
$\ell_1$ and the binning scheme, Receiver 1 can deduce the message $m_1$ from the bin index.
Receiver 2 uses a similar decoding strategy to obtain $\ell_2$ and infer $m_2$.

Error analysis. An error occurs if one (or more) of the following events occurs.

(E0): An encoding error occurs whenever there is no jointly typical sequence in $B_{m_1} \times C_{m_2}$
for some message pair $(m_1, m_2)$.

(E1): A decoding error occurs at Receiver 1 if $L_1' \neq \ell_1$.

(E2): A decoding error occurs at Receiver 2 if $L_2' \neq \ell_2$.

The probability of an encoding error (E0) is bounded like in the classical Marton
scheme [Mar79, EGK10, Cov98]. To see this, we use Cover's counting argument
[Cov98]. The probability that two random sequences $u_1^n$, $u_2^n$ chosen according to
the marginals are jointly typical is $2^{-nI(U_1;U_2)}$, and since there are $2^{n[I(U_1;B_1)-R_1]}$ and
$2^{n[I(U_2;B_2)-R_2]}$ sequences in each bin, the expected number of jointly typical sequences
that can be constructed from each combination of bins is

$$2^{n[I(U_1;B_1)-R_1]}\; 2^{n[I(U_2;B_2)-R_2]}\; 2^{-nI(U_1;U_2)}. \tag{6.24}$$

Thus, if we choose $R_1 + R_2 + \delta \leq I(U_1;B_1) + I(U_2;B_2) - I(U_1;U_2)$, then the expected
number of strongly jointly typical sequences in $B_{m_1} \times C_{m_2}$ is much larger than one.
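
To get a feeling for the orders of magnitude in the counting argument, the exponent of (6.24) can be evaluated for concrete numbers. The values of the mutual informations and rates below are made up for illustration and do not correspond to any channel studied in this thesis; the function name is ours.

```python
def log2_expected_typical_pairs(n, i_u1b1, i_u2b2, i_u1u2, r1, r2):
    """Base-2 logarithm of the expected number of jointly typical pairs
    per combination of bins, i.e. the exponent of expression (6.24)."""
    return n * ((i_u1b1 - r1) + (i_u2b2 - r2) - i_u1u2)

# Assumed example values (in bits per channel use).
inside = log2_expected_typical_pairs(1000, 0.6, 0.5, 0.2, r1=0.4, r2=0.4)
outside = log2_expected_typical_pairs(1000, 0.6, 0.5, 0.2, r1=0.5, r2=0.5)
print(inside, outside)  # positive exponent: many candidate pairs; negative: almost none
```

A sum rate below $I(U_1;B_1) + I(U_2;B_2) - I(U_1;U_2)$ gives an exponentially large expected number of candidate pairs per bin combination, whereas a sum rate above it gives an exponentially small one.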

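To bound the probability of error event (E1), we use the Hayashi-Nagaoka operator
inequality (Lemma 3.1):

$$\begin{aligned}
\Pr(E1) &= \frac{1}{L_1} \sum_{\ell_1} \mathrm{Tr}\big[ (I - \Lambda_{\ell_1})\,\rho_{\ell_1,\ell_2} \big] \\
&\leq \frac{1}{L_1} \sum_{\ell_1} \Big( 2\,\underbrace{\mathrm{Tr}\big[ \big(I - \Pi^n_{\bar{\rho},\delta}\,\Pi_{u_1^n(\ell_1)}\,\Pi^n_{\bar{\rho},\delta}\big)\,\rho_{\ell_1,\ell_2} \big]}_{(T1)} \\
&\qquad\qquad + 4 \sum_{\ell_1' \neq \ell_1} \underbrace{\mathrm{Tr}\big[ \Pi^n_{\bar{\rho},\delta}\,\Pi_{u_1^n(\ell_1')}\,\Pi^n_{\bar{\rho},\delta}\;\rho_{\ell_1,\ell_2} \big]}_{(T2)} \Big).
\end{aligned}$$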

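Consider the following lemma [Wil11, Property 14.2.7].

Lemma 6.1. When $u_1^n(\ell_1)$ and $u_2^n(\ell_2)$ are strongly jointly typical, the state $\rho_{\ell_1,\ell_2}$ is
well supported by both the averaged and conditionally typical projector in the sense
that: $\mathrm{Tr}\big[\Pi^n_{\bar{\rho},\delta}\,\rho_{\ell_1,\ell_2}\big] \geq 1 - \epsilon$, $\forall \ell_1, \ell_2$, and $\mathrm{Tr}\big[\Pi_{u_1^n(\ell_1)}\,\rho_{\ell_1,\ell_2}\big] \geq 1 - \epsilon$, $\forall \ell_2$.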



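To bound the first term (T1), we use the following argument:

$$\begin{aligned}
1 - (T1) &= \mathrm{Tr}\big[ \Pi^n_{\bar{\rho},\delta}\,\Pi_{u_1^n(\ell_1)}\,\Pi^n_{\bar{\rho},\delta}\;\rho_{\ell_1,\ell_2} \big] \\
&\geq \mathrm{Tr}\big[ \Pi_{u_1^n(\ell_1)}\,\rho_{\ell_1,\ell_2} \big] - \big\| \Pi^n_{\bar{\rho},\delta}\,\rho_{\ell_1,\ell_2}\,\Pi^n_{\bar{\rho},\delta} - \rho_{\ell_1,\ell_2} \big\|_1 \\
&\geq (1 - \epsilon) - 2\sqrt{\epsilon},
\end{aligned} \tag{6.25}$$

where the inequalities follow from (6.15) and Lemma 6.1. This use of Lemma 6.1
demonstrates why the Marton coding scheme selects the sequences $u_1^n(\ell_1)$ and $u_2^n(\ell_2)$
such that they are strongly jointly typical.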

To bound the second term, we begin by applying a variant of the projector trick
from (6.16). For what follows, note that the expectation $\mathbb{E}_{U_1,U_2}$ over the random code
is with respect to the product distribution $p_{U_1^n}(u_1^n)\,p_{U_2^n}(u_2^n)$:



E {(T2)} = E { T^4^ls^une'.)^ls 

We continue the proof using averaging over the choice of codebook and the properties 
of typical projectors: 



U2 



^nms^iu^HS] ^ ^ Tr n«, E{coe,} His E{p4a} 



^ 2n[H{B.\Ui)+S] £ Tr 



U2 



TllsP^ls ^{Peui.} 



Therefore, if we choose $L_1 \leq 2^{n[I(U_1;B_1)-3\delta]}$, the probability of error will go to
zero in the asymptotic limit of many channel uses. The analysis of the event (E2) is
similar. □



6.4 Discussion 

We established two achievable rate regions for the classical-quantum broadcast channel. 
In each case a fundamentally different coding strategy was used. 

The superposition coding strategy is a very powerful coding technique for encod- 
ing two "layers" of messages in the same codeword. Recall that the codebooks in 
the Chong-Motani-Garg coding strategy were also constructed using the superposition 
coding technique. In the next chapter, we will use this technique to build codes for the 
relay channel.

The binning strategy used in the Marton scheme is also applicable more widely.
It can be used every time two uncorrelated messages must be encoded into a single
codeword. From the point of view of Receiver 1, the messages intended for Receiver 2
are seen as random noise. By using the correlated variables $(U_1, U_2) \sim p(u_1, u_2)$ to construct
the codebooks we can obtain better rates than would be possible if independent
codebooks were used. This is because the "noise" codewords are now correlated with
the messages for Receiver 1 and thus help him with the communication task.

Note that the above two techniques can be combined to give the quantum Marton
coding scheme with a common message [Tak12].

Suppose that a source wishes to communicate with a remote destination and that a
relay station is available which can decode the messages transmitted by the source 
during one time slot and forward them to the destination during the next time slot. 
With the relay's help, the source and the destination can improve communication rates 
because the destination can decode the intended messages in parallel from the channel 
outputs during two consecutive time slots. In this way, useful information is received 
both from the source and the relay. 

The discrete memoryless relay channel is a probabilistic model for a communication
scenario with a source, a destination and a cooperative relay station. The channel is
modelled as a two-input two-output conditional probability distribution

$$p(y_1, y \,|\, x, x_1), \tag{7.1}$$

where $x$ is the input of the source, $y_1$ and $x_1$ are the received symbol and transmitted
symbol of the relay, and $y$ is the output at the destination. This relay channel model is
very general and contains many of the other ideas presented in this thesis. The transmission
of the source towards the relay and the destination is a kind of broadcast channel,
whereas the decoding at the destination is an instance of the multiple access channel.
These correspondences can inform our choice of coding strategies, but in order to take
full advantage of the communication network we must build a relay channel code which
aims to achieve the best overall rate from the source to the destination.

Figure 7.1: The classical relay channel.

In this chapter, we will review some of the coding strategies for the classical relay
channel and then show that the partial decode-and-forward strategy can be applied
to the classical-quantum relay channel. Note that we depart from the usual naming
conventions for senders and receivers. We do so because both the source and the relay
act as senders in our scenario, so more specific identifiers are necessary.

7.1 Introduction 

Consider two villages located in a valley that wish to establish a communication link 
between them using a direct link and also with the help of a radio tower on a nearby 
mountain peak. We can set up a relay station on the tower, which decodes the messages
from the source village and retransmits them towards the destination village. Assuming 
the villagers only have access to point-to-point communication technologies, they now 
have two obvious options. Either they send information on the direct transmission 
link, or they use full relaying, where all their communication happens via the tower. 
In the first case, the tower is not used at all and in the second case the direct link is 
not used at all. 

It is worthwhile to examine the exact timing associated with the information flow
in the latter scenario, since it is the first appearance of a multi-hop communication
protocol. Let us assume that the source wants to send the string "constitution" to
the destination. Assume that we use codewords of size $n$, and that each character is
encoded in a separate codeword. The source and the relay have transmit codebooks
$\{X^n(a)\}$, $\{X_1^n(a)\}$, $a \in \mathrm{ASCII}$.

The direct transmission strategy will make $12n$ uses of the channel. The transmissions
of the source in the successive blocks will be $[X^n(c), X^n(o), X^n(n), \ldots, X^n(n)]$. The
relay will transmit a fixed codeword during this time. The destination will simply
use a point-to-point decoder to extract the messages. The rate achievable using this
strategy is given by:

$$R \leq \sup_{p(x),\,x_1} I(X;Y|X_1 = x_1). \tag{7.2}$$

The full relaying strategy will use the channel $13n$ times, where the need for an
extra block of transmission is introduced by the decoding delay at the relay. During the
13 blocks, the transmissions of the source will be $[X^n(c), X^n(o), X^n(n), \ldots, X^n(n), \emptyset]$,
whereas the transmissions of the relay are one block behind: $[\emptyset, X_1^n(c), X_1^n(o), \ldots,
X_1^n(o), X_1^n(n)]$. The source simply has no more messages to send during block 13,
whereas the relay has no information to forward during the first block, so both parties
will stay silent during these different times. The rates that are achievable by this
approach are:

$$R \leq \sup_{p(x),\,p(x_1)} \min\{ I(X_1;Y),\; I(X;Y_1|X_1) \}. \tag{7.3}$$

This corresponds to the minimum of the point-to-point capacities of the two legs of 
the transmission. Note that the second mutual information term is conditional on Xi, 
since the relay knows its own transmit signal. 
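
The timing of the full relaying strategy can be made concrete with a few lines of code. The sketch below only tracks which character is being sent in each block (not the codewords themselves); the function name `relay_schedule` and the use of `None` for a silent block are our own conventions.

```python
def relay_schedule(message):
    """Per-block (source, relay) transmissions for full relaying.
    None marks a block in which the corresponding party stays silent."""
    source = list(message) + [None]   # nothing left to send in the last block
    relay = [None] + list(message)    # the relay lags one block behind
    return list(zip(source, relay))

blocks = relay_schedule("constitution")
for b, (src, rel) in enumerate(blocks, start=1):
    print(b, src, rel)
```

The 12-character string takes 13 blocks, and in every block after the first the relay forwards the character that the source sent in the previous block.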

Surely a better strategy must exist than the ones described above. How can we 
use both the direct link and the relayed link at the same time? 

7.1.1 Classical relay channel coding strategies 

Two important families of coding strategies exist for relay channels: compress-and-forward
and decode-and-forward [CEG79, EGK10].

In compress-and-forward strategies, the relay does not try to decode the message
from his received signal $Y_1^n$, but simply searches for a close sequence chosen from
a predetermined compression codebook. To continue the example from the previous
section, suppose that the relay's decoding simply tries to determine whether the transmitted
message is a vowel or a consonant. This partial information about the message
is then forwarded to the destination during the next block, encoded into a codeword
$X_1(s)$, $s \in \{\text{consonant}, \text{vowel}\}$, to serve as side-information for the decoding at the
destination.

Compress-and-forward strategies are appropriate in situations where the direct link
between the source and the destination is stronger than the link from the source to
the relay. In such a situation it would be disadvantageous to require that the messages
from the source be fully decoded by the relay. Still, if the relay decodes something
and forwards this information to the destination, better rates are achievable than if we
simply chose to not use the relay as in the direct coding approach [EGK10].

In a decode-and-forward strategy, each of the transmitted messages is decoded by
the relay and retransmitted during the next block. Using this strategy, the destination
can decode useful information both from the source and the relay. In this way we could
achieve the maximum possible throughput to the destination $I(X, X_1; Y)$.

There are at least three decoding strategies that can be used by the destination:
backwards decoding, sequential decoding with binning at the relay, or collective decoding
of consecutive output blocks of the channel (joint decoding). All three decoding
techniques for the decode-and-forward strategy achieve the same rate:

$$R \leq \max_{p(x,x_1)} \min\{ I(X,X_1;Y),\; I(X;Y_1|X_1) \}. \tag{7.4}$$

We will focus on the collective decoding strategy. 
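
The two terms inside the minimum in (7.4) can be evaluated numerically for a fixed input distribution. The following is a minimal sketch and not part of the original text: the helper names `mutual_information` and `decode_forward_rate` and the toy channel are illustrative assumptions, and the code evaluates the expression for one given $p(x,x_1)$ rather than carrying out the maximization.

```python
from math import log2

def mutual_information(joint, a, b, c=()):
    """I(A;B|C) in bits. `joint` maps outcome tuples to probabilities;
    a, b, c are tuples of coordinate indices into each outcome."""
    def marginal(idx):
        out = {}
        for outcome, p in joint.items():
            key = tuple(outcome[i] for i in idx)
            out[key] = out.get(key, 0.0) + p
        return out

    p_abc, p_ac = marginal(a + b + c), marginal(a + c)
    p_bc, p_c = marginal(b + c), marginal(c)
    total = 0.0
    for key, p in p_abc.items():
        if p == 0.0:
            continue
        ka = key[:len(a)]
        kb = key[len(a):len(a) + len(b)]
        kc = key[len(a) + len(b):]
        total += p * log2(p * p_c[kc] / (p_ac[ka + kc] * p_bc[kb + kc]))
    return total

def decode_forward_rate(p_inputs, channel):
    """min{ I(X,X1;Y), I(X;Y1|X1) } for one fixed p(x,x1).
    `channel[(x, x1)]` maps output pairs (y, y1) to probabilities."""
    joint = {}
    for (x, x1), p in p_inputs.items():
        for (y, y1), q in channel[(x, x1)].items():
            key = (x, x1, y, y1)
            joint[key] = joint.get(key, 0.0) + p * q
    i_dest = mutual_information(joint, (0, 1), (2,))        # I(X,X1;Y)
    i_relay = mutual_information(joint, (0,), (3,), (1,))   # I(X;Y1|X1)
    return min(i_dest, i_relay)

# Toy channel (an assumption): the relay sees X noiselessly (Y1 = X) and
# the destination sees only the relay input (Y = X1).
channel = {(x, x1): {(x1, x): 1.0} for x in (0, 1) for x1 in (0, 1)}
uniform = {(x, x1): 0.25 for x in (0, 1) for x1 in (0, 1)}
rate = decode_forward_rate(uniform, channel)
print(rate)
```

For this toy channel both terms equal one bit, so the relayed link alone supports one bit per channel use under the uniform input distribution.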

To illustrate the collective decoding strategy, let us consider again the situation in
which the source village is transmitting the string "constitution" to the destination
village. The transmission will take 13 block-uses of the channel. Figure 7.2 illustrates
the flow of information for the character "n", which happens during the third and fourth
block-uses of the channel. During the third and the fourth transmission blocks, the
destination has collected the output variables $(Y^n_{(3)}, Y^n_{(4)})$ and will perform a
decoding operation on both outputs collectively. The rate $I(X, X_1; Y)$ is obtained from the
decomposition $I(X, X_1; Y) = I(X; Y|X_1) + I(X_1; Y)$, where the second term comes
from the probability of making a mistake when decoding $x^n_{1(4)}(\text{n})$ from $Y^n_{(4)}$
and the first term comes from the probability of wrongly decoding $x^n_{(3)}(\text{n})$ from
$Y^n_{(3)}$.



(a) During block 3, the relay will transmit its codeword for "o", which we assume was
received in the previous block. The source transmits a codeword $x^n(\text{n}|\text{o})$
which is chosen from a coherent codebook.

(b) During block 4, the relay will transmit its codeword for "n", which we assume was
received in the previous block. The source transmits a codeword $x^n(\text{s}|\text{n})$.

Figure 7.2: Information flow in the relay network during the third and fourth transmission
blocks of the string "constitution".
Observe that the optimization in (7.4) is taken over all joint input distributions
$p_{XX_1}(x, x_1)$, which would seem to contradict the assumption that the source and the
relay are different parties and cannot synchronize their encoding. Recall that in the
multiple access channel problem, the assumption that the senders act independently
translated to the optimization over all product distributions $p_{X_1}(x_1)p_{X_2}(x_2)$ in (4.6).

The change from $p_X(x)p_{X_1}(x_1)$ to $p_{XX_1}(x, x_1)$ is allowed because the source uses
a coherent codebook. The codewords for the relay are chosen according to $p_{X_1}(x_1)$,
whereas the codewords for the sender are chosen according to $p_{X|X_1}(x|x_1)$, conditional
on the codeword of the relay. But how could the source possibly know what the relay
will be transmitting during each time instant? No telepathic abilities are necessary,
only optimism. The source knows what the relay will be transmitting because, if the
protocol is working, it should be the codeword from the previous block.

The partial decode-and-forward strategy differs from the decode-and-forward strategy
in that it requires the relay to decode only part of the message from the source
[CEG79]. The idea is similar to the partial interference cancellation strategy used by
Han and Kobayashi for the interference channel [HK81], which is its contemporary.



7.1.2 Quantum relay channels 



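A classical-quantum relay channel $\mathcal{N}$ is a map with two classical inputs $x$ and
$x_1$ and two output quantum systems $B_1$ and $B$. For each pair of possible input symbols
$(x, x_1) \in \mathcal{X} \times \mathcal{X}_1$, the channel prepares a density operator
$\rho^{B_1B}_{x,x_1}$ defined on the tensor-product Hilbert space
$\mathcal{H}^{B_1} \otimes \mathcal{H}^{B}$:

$$\rho^{B_1B}_{x,x_1} \equiv \mathcal{N}^{XX_1 \to B_1B}(x, x_1), \tag{7.5}$$

where $B_1$ is the relay output and $B$ is the destination output.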




Figure 7.3: The quantum relay channel $\rho^{B_1B}_{x,x_1}$.



7.1.3 Chapter overview 

In this chapter we develop the partial decode-and-forward strategy for classical-quantum
relay channels [SWV12]. This partial decoding at the relay is a more general strategy
than the full decode-and-forward strategy in the same way that the partial interference
cancellation strategy for the interference channel (the Han-Kobayashi strategy)
was more general than a full interference cancellation strategy.

Our results are the first extension of the quantum simultaneous decoding techniques
used in [FHS+12, Sen12a] to multi-hop networks. The decoding is based on
a novel "sliding-window" quantum measurement (see [Car82, XK05]) which involves
a collective measurement on two consecutive blocks of the output in order to extract
information from both the sender and the relay.

The next section will describe the coding strategy in more detail and state our
results. The proof is given in Section 7.3.



7.2 Partial decode-and-forward strategy 

The idea for the code construction is to use a split codebook strategy where the source
decomposes the message set into the Cartesian product of two different sets $\mathcal{L}$
and $\mathcal{M}$. We can think of the set $\mathcal{L}$ as consisting of common messages
that both the relay and the destination decode, while the set $\mathcal{M}$ consists of
personal messages that only the destination decodes.

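In the context of our coding strategy, we analyze the average probability of error
at the relay:

$$\bar{p}^{R}_e \equiv \frac{1}{|\mathcal{L}|} \sum_{\ell_j} \mathrm{Tr}\Big[ \big(I - \Gamma^{B^n_{1(j)}}_{\ell_j}\big)\, \rho^{B^n_{1(j)}}_{m_j,\ell_j,\ell_{j-1}} \Big],$$

and the average probability of error at the destination:

$$\bar{p}^{D}_e \equiv \frac{1}{|\mathcal{M}||\mathcal{L}|} \sum_{m_j,\ell_j} \mathrm{Tr}\Big[ \big(I - \Lambda^{B^n_{(j)}B^n_{(j+1)}}_{m_j,\ell_j}\big)\, \rho^{B^n_{(j)}}_{m_j,\ell_j,\ell_{j-1}} \otimes \rho^{B^n_{(j+1)}}_{m_{j+1},\ell_{j+1},\ell_j} \Big]. \tag{7.6}$$

The operators $(I - \Gamma_{\ell_j})$ and $(I - \Lambda_{m_j,\ell_j})$ correspond to the
complements of the correct decoding outcomes.

Definition 7.1. An $(n, R, \epsilon)$ partial decode-and-forward code for the quantum relay
channel consists of two codebooks $\{x^n(m_j, \ell_j)\}_{m_j \in \mathcal{M}, \ell_j \in \mathcal{L}}$
and $\{x_1^n(\ell_j)\}_{\ell_j \in \mathcal{L}}$ and decoding POVMs
$\{\Gamma_{\ell_j}\}_{\ell_j \in \mathcal{L}}$ (for the relay) and
$\{\Lambda_{m_j,\ell_j}\}_{m_j \in \mathcal{M}, \ell_j \in \mathcal{L}}$ (for the destination), such that
the average probability of error is bounded from above as
$\bar{p}_e = \bar{p}^{R}_e + \bar{p}^{D}_e \leq \epsilon$.

A rate $R$ is achievable if there exists an $(n, R - \delta, \epsilon)$ quantum relay channel code
for all $\epsilon, \delta > 0$ and sufficiently large $n$.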

The theorem below captures the main result of this chapter.

Theorem 7.1 (Partial decode-and-forward inner bound). Let $\{\rho_{x,x_1}\}$ be a cc-qq relay
channel as in (7.5). Then a rate $R$ is achievable, provided that the following inequality
holds:

$$R \leq \max_{p(u,x,x_1)} \min\Big\{\, I(XX_1;B)_\theta,\ \ I(U;B_1|X_1)_\theta + I(X;B|X_1U)_\theta \,\Big\}, \tag{7.7}$$

where the information quantities are with respect to the classical-quantum state

$$\theta^{UXX_1B_1B} \equiv \sum_{u,x,x_1} p(u,x,x_1)\; |u\rangle\langle u|^{U} \otimes |x\rangle\langle x|^{X} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes \rho^{B_1B}_{x,x_1}. \tag{7.8}$$

Our code construction employs codebooks $\{x_1^n\}$, $\{u^n\}$, and $\{x^n\}$ generated
according to the distribution $p(x_1)p(u|x_1)p(x|u, x_1)$. We split the message for each block into
two parts $(m, \ell) \in \mathcal{M} \times \mathcal{L}$ such that we have $R = R_m + R_\ell$. The
relay fully decodes the message $\ell$ and re-encodes it directly (without using binning) in the
next block. The destination exploits a "sliding-window" decoding strategy [Car82, XK05] by
performing a collective measurement on two consecutive blocks. In this approach, the message
pair $(m_j, \ell_j)$ sent during block $j$ is decoded from the outputs of blocks $j$ and $j + 1$,
using an "AND-measurement."



7.3 Achievability proof 

We divide the channel uses into many blocks and build codes in a randomized,
block-Markov manner within each block. The channel is used for $b$ blocks, each indexed by
$j \in \{1, \ldots, b\}$. Our error analysis shows that:

• The relay can decode the message $\ell_j$ during block $j$.

• The destination can simultaneously decode $(m_j, \ell_j)$ from a collective
measurement on the output systems of blocks $j$ and $j + 1$.

The error analysis at the relay is similar to that of the Holevo-Schumacher-Westmoreland
theorem [Hol98, SW97]. The message $\ell_j$ can be decoded reliably if the rate $R_\ell$ obeys
the following inequality:

$$R_\ell \leq I(U;B_1|X_1)_\theta. \tag{7.9}$$

The decoding at the destination is a variant of the quantum simultaneous decoder
from Theorem 4.2. To decode the message pair $(m_j, \ell_j)$, the destination uses a
"sliding-window" decoder, implemented as an "AND-measurement" on the outputs of blocks $j$
and $j + 1$. This coding technique does not require binning at the relay or backwards
decoding at the destination [Car82, XK05].

In this section, we give the details of the coding strategy and analyze the probability
of error for the destination and the relay.

Codebook construction. Fix a code distribution $p(u, x, x_1) = p(x_1)p(u|x_1)p(x|x_1, u)$
and independently generate a different codebook for each block $j$ as follows:

• Randomly and independently generate $2^{nR_\ell}$ sequences $x_1^n(\ell_{j-1})$,
$\ell_{j-1} \in [1 : 2^{nR_\ell}]$, according to $\prod_{i=1}^{n} p(x_{1i})$.

• For each $x_1^n(\ell_{j-1})$, randomly and independently generate $2^{nR_\ell}$ sequences
$u^n(\ell_j, \ell_{j-1})$, $\ell_j \in [1 : 2^{nR_\ell}]$, according to
$\prod_{i=1}^{n} p(u_i | x_{1i}(\ell_{j-1}))$.

• For each $x_1^n(\ell_{j-1})$ and each corresponding $u^n(\ell_j, \ell_{j-1})$, randomly and
independently generate $2^{nR_m}$ sequences $x^n(m_j, \ell_j, \ell_{j-1})$,
$m_j \in [1 : 2^{nR_m}]$, according to the distribution
$\prod_{i=1}^{n} p(x_i | x_{1i}(\ell_{j-1}), u_i(\ell_j, \ell_{j-1}))$.
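
The three layers of the codebook construction above can be sketched in code. This is a hedged illustration only: the function names `sample` and `build_block_codebook`, the toy binary distributions, and the tiny codebook sizes are assumptions made here, and the integers `num_l` and `num_m` stand in for $2^{nR_\ell}$ and $2^{nR_m}$.

```python
import random

def sample(pmf, rng):
    """Draw one symbol from a pmf given as {symbol: probability}."""
    symbols = list(pmf)
    return rng.choices(symbols, weights=[pmf[s] for s in symbols], k=1)[0]

def build_block_codebook(p_x1, p_u_given_x1, p_x_given_x1u, n, num_l, num_m, rng):
    """Generate the three codebook layers for one block."""
    # Layer 1: relay codewords x1^n(l_prev), i.i.d. from p(x1).
    x1 = {lp: [sample(p_x1, rng) for _ in range(n)] for lp in range(num_l)}
    # Layer 2: u^n(l, l_prev), drawn symbol by symbol given x1^n(l_prev).
    u = {(l, lp): [sample(p_u_given_x1[s], rng) for s in x1[lp]]
         for lp in range(num_l) for l in range(num_l)}
    # Layer 3: x^n(m, l, l_prev), drawn given x1^n(l_prev) and u^n(l, l_prev).
    x = {(m, l, lp): [sample(p_x_given_x1u[(s1, su)], rng)
                      for s1, su in zip(x1[lp], u[(l, lp)])]
         for lp in range(num_l) for l in range(num_l) for m in range(num_m)}
    return x1, u, x

# Toy binary distributions (assumptions chosen only for illustration).
p_x1 = {0: 0.5, 1: 0.5}
p_u_given_x1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}
p_x_given_x1u = {(s1, su): {su: 0.8, 1 - su: 0.2} for s1 in (0, 1) for su in (0, 1)}

rng = random.Random(0)
x1, u, x = build_block_codebook(p_x1, p_u_given_x1, p_x_given_x1u,
                                n=8, num_l=4, num_m=2, rng=rng)
print(len(x1), len(u), len(x))
```

The nesting of the dictionary keys mirrors the superposition structure: every source codeword is indexed by the previous common message because it is generated conditionally on the relay codeword for that message.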



Transmission. The transmission of $(m_j, \ell_j)$ to the destination happens during
blocks $j$ and $j + 1$, as illustrated in Figure 7.4. At the beginning of block $j$, we assume
that the relay has correctly decoded the message $\ell_{j-1}$. During block $j$, the source inputs
the new messages $m_j$ and $\ell_j$, and the relay forwards the old message $\ell_{j-1}$. That is,
their inputs to the channel for block $j$ are the codewords $x^n(m_j, \ell_j, \ell_{j-1})$ and
$x_1^n(\ell_{j-1})$, leading to the following state at the channel outputs:

$$\rho^{(j)}_{m_j,\ell_j,\ell_{j-1}} \equiv \rho^{B^n_{1(j)}B^n_{(j)}}_{x^n(m_j,\ell_j,\ell_{j-1}),\,x_1^n(\ell_{j-1})}.$$

During block $j + 1$, the source transmits $(m_{j+1}, \ell_{j+1})$ given $\ell_j$, whereas the relay
sends $\ell_j$, leading to the state:

$$\rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j} \equiv \rho^{B^n_{1(j+1)}B^n_{(j+1)}}_{x^n(m_{j+1},\ell_{j+1},\ell_j),\,x_1^n(\ell_j)}.$$

Our shorthand notation is such that the states are identified by the messages that they
encode, and the codewords are implicit.



(a) During block 2, the relay will transmit its codeword $x^n_{1(2)}(\text{o})$. We
assume "o" was correctly decoded by the relay during the previous block. The source
transmits a codeword $x^n_{(2)}(\text{n}, \text{s}, \text{o})$.

(b) During block 3, the relay will transmit its codeword $x^n_{1(3)}(\text{s})$, which
encodes the message $\ell_2 = $ "s" transmitted by the source during block 2. The
source transmits the codeword $x^n_{(3)}(\text{t}, \text{i}, \text{s})$.

Figure 7.4: Information flow in the relay network during the second and third
transmission blocks of the string "co ns ti tu ti on" when using the partial
decode-and-forward strategy. The messages for each block (two characters) are encoded by the
sender using a codebook $x^n(m_j, \ell_j, \ell_{j-1})$ during block $j$. The message pairs
$(m_j, \ell_j)$ for the seven uses of the channel are:
$\{(\text{c},\text{o}), (\text{n},\text{s}), (\text{t},\text{i}), (\text{t},\text{u}), (\text{t},\text{i}), (\text{o},\text{n}), (\emptyset,\emptyset)\}$.
The source codebook depends on the current message pair $(m_j, \ell_j)$ as well as the
message of the previous block, so the transmitted codewords during the seven blocks are:
$\{x^n_{(1)}(\text{c},\text{o},\emptyset),\ x^n_{(2)}(\text{n},\text{s},\text{o}),\ x^n_{(3)}(\text{t},\text{i},\text{s}),\ x^n_{(4)}(\text{t},\text{u},\text{i}),\ x^n_{(5)}(\text{t},\text{i},\text{u}),\ x^n_{(6)}(\text{o},\text{n},\text{i}),\ x^n_{(7)}(\emptyset,\emptyset,\text{n})\}$
and
$\{x^n_{1(1)}(\emptyset),\ x^n_{1(2)}(\text{o}),\ x^n_{1(3)}(\text{s}),\ x^n_{1(4)}(\text{i}),\ x^n_{1(5)}(\text{u}),\ x^n_{1(6)}(\text{i}),\ x^n_{1(7)}(\text{n})\}$.
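
The block-Markov bookkeeping described in the caption can be reproduced with a few lines of code. This is an illustrative sketch only: the function name `block_markov_schedule` is hypothetical, and the string "0" is used here as a stand-in for the empty (dummy) message.

```python
def block_markov_schedule(pairs, dummy="0"):
    """For each block j, list the source codeword index (m_j, l_j, l_{j-1})
    and the relay codeword index l_{j-1}. `dummy` stands for the empty message."""
    schedule = []
    prev_l = dummy
    for j, (m, l) in enumerate(pairs, start=1):
        schedule.append({"block": j, "source": (m, l, prev_l), "relay": prev_l})
        prev_l = l
    return schedule

# Message pairs for "co ns ti tu ti on", plus a final dummy pair that lets
# the relay forward the last common message.
pairs = [("c", "o"), ("n", "s"), ("t", "i"), ("t", "u"),
         ("t", "i"), ("o", "n"), ("0", "0")]
schedule = block_markov_schedule(pairs)
for entry in schedule:
    print(entry["block"], "source:", entry["source"], "relay:", entry["relay"])
```

The relay always lags one block behind the source, which is why a seventh block with dummy messages is needed to deliver the sixth common message.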



7.3.1 Decoding at the destination 

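We now determine a decoding POVM that the destination can perform on the output
systems spanning blocks $j$ and $j + 1$. The destination is trying to recover messages $\ell_j$
and $m_j$ given knowledge of $\ell_{j-1}$.

First let us consider forming decoding operators for block $j + 1$. Consider the state
obtained by tracing over the systems $X$, $U$, and $B_1$ in (7.8):

$$\theta^{X_1B} = \sum_{x_1} p(x_1)\, |x_1\rangle\langle x_1|^{X_1} \otimes \tau^{B}_{x_1},$$

where $\tau^{B}_{x_1} \equiv \sum_{u,x} p(x|x_1,u)\,p(u|x_1)\, \rho^{B}_{x,x_1}$. Also, let $\bar\tau^{B}$
denote the following state: $\bar\tau^{B} \equiv \sum_{x_1} p(x_1)\,\tau^{B}_{x_1}$. Corresponding to
the above states are conditionally typical projectors of the following form:

$$\Pi^{B^n_{(j+1)}}_{\tau_{x_1^n(\ell_j)}}, \qquad \Pi^{B^n_{(j+1)}}_{\bar\tau},$$

which we combine to form the positive operator:

$$P^{B^n_{(j+1)}}_{\ell_j} \equiv \Pi^{B^n_{(j+1)}}_{\bar\tau}\ \Pi^{B^n_{(j+1)}}_{\tau_{x_1^n(\ell_j)}}\ \Pi^{B^n_{(j+1)}}_{\bar\tau}, \tag{7.10}$$

that acts on the output systems $B^n_{(j+1)}$ of block $j + 1$.

Let us now form decoding operators for block $j$. Define the conditionally typical
projector for the state $\rho^{B^n_{(j)}}_{x^n(m_j,\ell_j,\ell_{j-1}),\,x_1^n(\ell_{j-1})}$ as

$$\Pi^{B^n_{(j)}}_{\rho_{x^n(m_j,\ell_j,\ell_{j-1}),\,x_1^n(\ell_{j-1})}}. \tag{7.11}$$

The state obtained from (7.8) by tracing over $X$ and $B_1$ is

$$\theta^{UX_1B} = \sum_{u,x_1} p(u|x_1)\,p(x_1)\, |u\rangle\langle u|^{U} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes \bar\rho^{B}_{u,x_1},$$

where $\bar\rho^{B}_{u,x_1} \equiv \sum_{x} p(x|x_1,u)\, \rho^{B}_{x,x_1}$. We can trace out over $U$ as well
to obtain the doubly averaged state
$\bar{\bar\rho}^{B}_{x_1} \equiv \sum_{u,x} p(x|x_1,u)\,p(u|x_1)\, \rho^{B}_{x,x_1}$.

The following conditionally typical projectors will be used in the decoding:

$$\Pi^{B^n_{(j)}}_{\bar\rho_{u^n(\ell_j,\ell_{j-1}),\,x_1^n(\ell_{j-1})}}, \qquad \Pi^{B^n_{(j)}}_{\bar{\bar\rho}_{x_1^n(\ell_{j-1})}}.$$

We can then form a positive operator "sandwich":

$$P^{B^n_{(j)}}_{m_j,\ell_j|\ell_{j-1}} \equiv \Pi_{\bar{\bar\rho}_{x_1^n}}\ \Pi_{\bar\rho_{u^n,x_1^n}}\ \Pi_{\rho_{x^n,x_1^n}}\ \Pi_{\bar\rho_{u^n,x_1^n}}\ \Pi_{\bar{\bar\rho}_{x_1^n}}, \tag{7.12}$$

where the codeword arguments $(m_j, \ell_j, \ell_{j-1})$ are omitted on the right-hand side.

Finally, we combine the positive operators from (7.10) and (7.12) to form the
"sliding-window" positive operator:

$$P^{B^n_{(j)}B^n_{(j+1)}}_{m_j,\ell_j|\ell_{j-1}} \equiv P^{B^n_{(j)}}_{m_j,\ell_j|\ell_{j-1}} \otimes P^{B^n_{(j+1)}}_{\ell_j}, \tag{7.13}$$

from which we can build the destination's measurement $\{\Lambda_{m_j,\ell_j|\ell_{j-1}}\}$ using the
square-root normalization. This measurement is what we call the "AND-measurement."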



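Error analysis at the destination. In this section, we prove that the destination
can correctly decode the message pair $(m_j, \ell_j)$ by employing the measurement
$\{\Lambda_{m_j,\ell_j|\ell_{j-1}}\}$ on the output state
$\rho^{(j)}_{m_j,\ell_j,\ell_{j-1}} \otimes \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}$ spanning blocks $j$
and $j + 1$. The average probability of error for the destination is given in (7.6). For now,
we consider the error analysis for a single message pair $(m_j, \ell_j)$:

$$\begin{aligned}
\bar{p}^{D}_e &\equiv \mathrm{Tr}\Big[\big(I - \Lambda_{m_j,\ell_j|\ell_{j-1}}\big)\ \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}} \otimes \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big] \\
&\leq 2\,\underbrace{\mathrm{Tr}\Big[\big(I - P^{B^n_{(j)}B^n_{(j+1)}}_{m_j,\ell_j|\ell_{j-1}}\big)\ \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}} \otimes \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big]}_{\text{(I)}} \\
&\quad + 4\,\underbrace{\sum_{(\ell'_j,m'_j)\neq(\ell_j,m_j)} \mathrm{Tr}\Big[P^{B^n_{(j)}B^n_{(j+1)}}_{m'_j,\ell'_j|\ell_{j-1}}\ \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}} \otimes \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big]}_{\text{(II)}},
\end{aligned}$$

where we used the Hayashi-Nagaoka inequality (Lemma 3.1) to decompose the error
operator $(I - \Lambda_{m_j,\ell_j|\ell_{j-1}})$ into two components: (I) a term related to the
probability that the correct detector does not "click", $(I - P_{m_j,\ell_j|\ell_{j-1}})$, and (II)
another term related to the probability that a wrong detector "clicks",
$\sum_{(\ell'_j,m'_j)\neq(\ell_j,m_j)} P_{m'_j,\ell'_j|\ell_{j-1}}$.
These two errors are analogous to the classical error events in which an output sequence
is either not jointly typical with the transmitted codeword or happens to be jointly
typical with another codeword.

We will bound the expectation of the average probability of error
$\mathbb{E}_{U^nX^nX_1^n}\{\bar{p}^{D}_e\}$ by bounding the expectation of the two error terms:
$\mathbb{E}_{U^nX^nX_1^n}\{\text{(I)}\}$ and $\mathbb{E}_{U^nX^nX_1^n}\{\text{(II)}\}$.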



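The first term (I) is bounded by using the properties of typical projectors and the
operator union bound from Lemma 5.1, which allows us to analyze the errors for the
two blocks separately. Because $0 \leq P^{B^n_{(j)}}_{m_j,\ell_j|\ell_{j-1}} \leq I$ and
$0 \leq P^{B^n_{(j+1)}}_{\ell_j} \leq I$, we have:

$$I - P^{B^n_{(j)}}_{m_j,\ell_j|\ell_{j-1}} \otimes P^{B^n_{(j+1)}}_{\ell_j} \ \leq\ \big(I - P^{B^n_{(j)}}_{m_j,\ell_j|\ell_{j-1}}\big) \otimes I^{B^n_{(j+1)}} + I^{B^n_{(j)}} \otimes \big(I - P^{B^n_{(j+1)}}_{\ell_j}\big). \tag{7.14}$$

We use the definition of $P^{B^n_{(j)}B^n_{(j+1)}}_{m_j,\ell_j|\ell_{j-1}}$ from (7.13) and the
inequality (7.14) to obtain

$$\begin{aligned}
\text{(I)} &= \mathrm{Tr}\Big[\big(I - P^{B^n_{(j)}}_{m_j,\ell_j|\ell_{j-1}} \otimes P^{B^n_{(j+1)}}_{\ell_j}\big)\ \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}} \otimes \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big] \\
&\leq \underbrace{\mathrm{Tr}\Big[\big(I - P^{B^n_{(j)}}_{m_j,\ell_j|\ell_{j-1}}\big)\, \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}}\Big]}_{\alpha} + \underbrace{\mathrm{Tr}\Big[\big(I - P^{B^n_{(j+1)}}_{\ell_j}\big)\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big]}_{\beta},
\end{aligned}$$

where we defined the error terms $\alpha$ and $\beta$ associated with block $j$ and block $(j + 1)$.
We proceed to bound the term $\beta$ as follows:

$$\begin{aligned}
\beta &= \mathrm{Tr}\Big[\big(I - P^{B^n_{(j+1)}}_{\ell_j}\big)\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big] \\
&= \mathrm{Tr}\Big[\big(I - \Pi_{\bar\tau}\, \Pi_{\tau_{x_1^n(\ell_j)}}\, \Pi_{\bar\tau}\big)\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big] \\
&= 1 - \mathrm{Tr}\Big[\Pi_{\tau_{x_1^n(\ell_j)}}\ \Pi_{\bar\tau}\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\, \Pi_{\bar\tau}\Big] \\
&\leq 1 - \mathrm{Tr}\Big[\Pi_{\tau_{x_1^n(\ell_j)}}\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big] + \Big\| \Pi_{\bar\tau}\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\, \Pi_{\bar\tau} - \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j} \Big\|_1,
\end{aligned}$$

where the inequality follows from Lemma 2.20. We will analyze the terms labeled $\alpha$
and $\beta$ separately.

By taking the expectation over the code randomness, we obtain the upper bound:

$$\begin{aligned}
\mathbb{E}_{U^nX^nX_1^n}\{\beta\} &\leq 1 - \mathbb{E}_{U^nX^nX_1^n} \mathrm{Tr}\Big[\Pi_{\tau_{x_1^n(\ell_j)}}\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big] + \mathbb{E}_{U^nX^nX_1^n} \Big\| \Pi_{\bar\tau}\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\, \Pi_{\bar\tau} - \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j} \Big\|_1 \\
&\leq 1 - (1 - \epsilon) + 2\sqrt{\epsilon}.
\end{aligned}$$

The inequality follows from
$\mathbb{E}_{U^nX^n|X_1^n}\{\rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\} = \tau_{x_1^n(\ell_j)}$, the properties of
typical projectors, $\mathbb{E}_{X_1^n} \mathrm{Tr}[\Pi_{\bar\tau}\, \tau_{X_1^n}] \geq 1 - \epsilon$ and
$\mathrm{Tr}[\Pi_{\tau_{x_1^n}}\, \tau_{x_1^n}] \geq 1 - \epsilon$, and Lemma 3.2.

The error term $\alpha$ is bounded in a similar fashion.

We can split the sum in the second error term (II) as follows:

$$\begin{aligned}
\text{(II)} &= \underbrace{\sum_{m'_j \neq m_j} \mathrm{Tr}\Big[P^{B^n_{(j)}B^n_{(j+1)}}_{m'_j,\ell_j|\ell_{j-1}}\ \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}} \otimes \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big]}_{\text{(A)}} \\
&\quad + \underbrace{\sum_{\ell'_j \neq \ell_j,\ m'_j} \mathrm{Tr}\Big[P^{B^n_{(j)}B^n_{(j+1)}}_{m'_j,\ell'_j|\ell_{j-1}}\ \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}} \otimes \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big]}_{\text{(B)}}.
\end{aligned}$$

We now analyze the two terms (A) and (B) separately.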



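Matching $\ell_j$, wrong $m_j$. Assuming $\ell_j$ is decoded correctly, we show that the
message $m_j$ will be decoded correctly provided $R_m < I(X;B|UX_1) - 2\delta$, where
$I(X;B|UX_1) = H(B|UX_1) - H(B|UXX_1)$. We will use the following properties of typical
projectors:

$$\Pi_{\rho_{x^n,x_1^n}} \leq 2^{n[H(B|UXX_1)+\delta]}\ \rho_{x^n,x_1^n}, \tag{7.15}$$

$$\Pi_{\bar\rho_{u^n,x_1^n}}\ \bar\rho_{u^n,x_1^n}\ \Pi_{\bar\rho_{u^n,x_1^n}} \leq 2^{-n[H(B|UX_1)-\delta]}\ \Pi_{\bar\rho_{u^n,x_1^n}}. \tag{7.16}$$

Consider the first term:

$$\begin{aligned}
&\mathrm{Tr}\Big[\big(P^{B^n_{(j)}}_{m'_j,\ell_j|\ell_{j-1}} \otimes P^{B^n_{(j+1)}}_{\ell_j}\big)\ \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}} \otimes \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big] \\
&\quad = \mathrm{Tr}\Big[P^{B^n_{(j)}}_{m'_j,\ell_j|\ell_{j-1}}\, \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}}\Big]\ \mathrm{Tr}\Big[P^{B^n_{(j+1)}}_{\ell_j}\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big] \\
&\quad \leq \mathrm{Tr}\Big[\Pi_{\bar{\bar\rho}_{x_1^n}}\, \Pi_{\bar\rho_{u^n,x_1^n}}\ \underbrace{\Pi_{\rho_{x^n(m'_j,\ell_j,\ell_{j-1}),x_1^n}}}_{①}\ \Pi_{\bar\rho_{u^n,x_1^n}}\, \Pi_{\bar{\bar\rho}_{x_1^n}}\ \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}}\Big].
\end{aligned}$$

We now upper bound expression ① using (7.15) and take the conditional expectation
with respect to $X^n$:

$$\mathbb{E}_{X^n|U^nX_1^n}\Big\{\Pi_{\rho_{x^n(m'_j,\ell_j,\ell_{j-1}),x_1^n}}\Big\} \leq 2^{n[H(B|UXX_1)+\delta]}\ \mathbb{E}_{X^n|U^nX_1^n}\Big\{\rho_{x^n(m'_j,\ell_j,\ell_{j-1}),x_1^n}\Big\} = 2^{n[H(B|UXX_1)+\delta]}\ \bar\rho_{u^n(\ell_j,\ell_{j-1}),x_1^n(\ell_{j-1})},$$

which is independent of the state $\rho^{(j)}_{m_j,\ell_j,\ell_{j-1}}$ since $m'_j \neq m_j$. The
resulting expression in ① has the state $\bar\rho_{u^n,x_1^n}$ sandwiched between its typical
projector on both sides, and so we can use (7.16). After these steps, we obtain the upper bound:

$$\begin{aligned}
\mathbb{E}_{X^n|U^nX_1^n}\{\text{(A)}\} &\leq 2^{n[H(B|UXX_1)+\delta]}\ 2^{-n[H(B|UX_1)-\delta]} \times \sum_{m'_j \neq m_j} \mathbb{E}_{X^n|U^nX_1^n}\, \mathrm{Tr}\Big[\Pi_{\bar{\bar\rho}_{x_1^n}}\, \Pi_{\bar\rho_{u^n,x_1^n}}\, \Pi_{\bar{\bar\rho}_{x_1^n}}\ \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}}\Big] \\
&\leq 2^{n[H(B|XUX_1)+\delta]}\ 2^{-n[H(B|UX_1)-\delta]} \sum_{m'_j \neq m_j} \mathbb{E}_{X^n|U^nX_1^n}\, \mathrm{Tr}\Big[\rho^{(j)}_{m_j,\ell_j,\ell_{j-1}}\Big] \\
&\leq |\mathcal{M}|\ 2^{-n[I(X;B|UX_1)-2\delta]}.
\end{aligned} \tag{7.17}$$

The second inequality follows because each operator inside the trace is positive
semidefinite and less than or equal to the identity.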


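Wrong $\ell_j$ (and thus wrong $m_j$). We obtain the requirement $R = R_\ell + R_m <
I(XX_1;B) = I(X_1;B) + I(UX;B|X_1)$ from the "AND-measurement" and the following
inequalities:

$$\mathbb{E}\ \mathrm{Tr}\big[\Pi_{\tau_{x_1^n}}\big] \leq 2^{n[H(B|X_1)+\delta]}, \tag{7.18}$$

$$\Pi_{\bar\tau}\ \bar\tau^{\otimes n}\ \Pi_{\bar\tau} \leq 2^{-n[H(B)-\delta]}\ \Pi_{\bar\tau}, \tag{7.19}$$

$$\mathbb{E}\ \mathrm{Tr}\big[\Pi_{\rho_{x^n,x_1^n}}\big] \leq 2^{n[H(B|UXX_1)+\delta]}, \tag{7.20}$$

$$\Pi_{\bar{\bar\rho}_{x_1^n}}\ \bar{\bar\rho}_{x_1^n}\ \Pi_{\bar{\bar\rho}_{x_1^n}} \leq 2^{-n[H(B|X_1)-\delta]}\ \Pi_{\bar{\bar\rho}_{x_1^n}}. \tag{7.21}$$

Consider the following term:

$$\text{(B)} = \sum_{\ell'_j \neq \ell_j,\ m'_j} \underbrace{\mathrm{Tr}\Big[P^{B^n_{(j)}}_{m'_j,\ell'_j|\ell_{j-1}}\, \rho^{(j)}_{m_j,\ell_j,\ell_{j-1}}\Big]}_{\text{(B1)}} \times \underbrace{\mathrm{Tr}\Big[P^{B^n_{(j+1)}}_{\ell'_j}\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big]}_{\text{(B2)}}.$$

We want to calculate the expectation of the term (B) with respect to the code
randomness $\mathbb{E}_{U^nX^nX_1^n}$. The random variables in different blocks are independent, and so
we can analyze the expectations of the factors (B1) and (B2) separately.

Consider first the calculation in block $j$, which leads to the following bound on the
expectation of the factor (B1). Writing primed projectors for those built from the codewords
of the wrong pair $(m'_j, \ell'_j)$, we have:

$$\begin{aligned}
\mathbb{E}_{U^nX^n|X_1^n}\{\text{(B1)}\} &= \mathbb{E}\ \mathrm{Tr}\Big[\Pi_{\bar{\bar\rho}_{x_1^n}}\, \Pi'_{\bar\rho}\, \Pi'_{\rho}\, \Pi'_{\bar\rho}\, \Pi_{\bar{\bar\rho}_{x_1^n}}\ \underbrace{\mathbb{E}_{U^nX^n|X_1^n}\big\{\rho^{(j)}_{m_j,\ell_j,\ell_{j-1}}\big\}}_{①}\Big] \\
&= \mathbb{E}\ \mathrm{Tr}\Big[\Pi'_{\bar\rho}\, \Pi'_{\rho}\, \Pi'_{\bar\rho}\ \underbrace{\Pi_{\bar{\bar\rho}_{x_1^n}}\, \bar{\bar\rho}_{x_1^n}\, \Pi_{\bar{\bar\rho}_{x_1^n}}}_{②}\Big] \\
&\leq 2^{-n[H(B|X_1)-\delta]}\ \mathbb{E}\ \mathrm{Tr}\Big[\Pi'_{\bar\rho}\, \Pi'_{\rho}\, \Pi'_{\bar\rho}\, \Pi_{\bar{\bar\rho}_{x_1^n}}\Big] \\
&\leq 2^{-n[H(B|X_1)-\delta]}\ \mathbb{E}\ \mathrm{Tr}\big[\Pi'_{\rho}\big] \\
&\leq 2^{-n[H(B|X_1)-\delta]}\ 2^{n[H(B|UXX_1)+\delta]} = 2^{-n[I(UX;B|X_1)-2\delta]}.
\end{aligned}$$

The expectation can be moved onto the state because $\ell'_j \neq \ell_j$, so the codewords of the
wrong pair are independent of the transmitted ones given $x_1^n(\ell_{j-1})$. The result of the
expectation in ① is $\bar{\bar\rho}_{x_1^n(\ell_{j-1})}$, and we can bound the expression in ② using
(7.21). The next inequality follows because all the other terms in the trace are
positive semidefinite operators less than or equal to the identity. The final inequality
follows from (7.20).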


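Now we consider the expectation of the second term:

$$\begin{aligned}
\mathbb{E}_{U^nX^nX_1^n}\{\text{(B2)}\} &= \mathbb{E}_{U^nX^nX_1^n}\ \mathrm{Tr}\Big[P^{B^n_{(j+1)}}_{\ell'_j}\, \rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\Big] \\
&= \mathrm{Tr}\Big[\mathbb{E}_{U^nX^nX_1^n}\big\{P^{B^n_{(j+1)}}_{\ell'_j}\big\}\ \mathbb{E}_{U^nX^nX_1^n}\big\{\rho^{(j+1)}_{m_{j+1},\ell_{j+1},\ell_j}\big\}\Big] \\
&= \mathbb{E}_{X_1^n}\ \mathrm{Tr}\Big[\Pi_{\bar\tau}\, \Pi_{\tau_{x_1^n(\ell'_j)}}\, \Pi_{\bar\tau}\ \bar\tau^{\otimes n}\Big] \\
&= \mathbb{E}_{X_1^n}\ \mathrm{Tr}\Big[\Pi_{\tau_{x_1^n(\ell'_j)}}\ \Pi_{\bar\tau}\, \bar\tau^{\otimes n}\, \Pi_{\bar\tau}\Big] \\
&\leq 2^{-n[H(B)-\delta]}\ \mathbb{E}_{X_1^n}\ \mathrm{Tr}\Big[\Pi_{\tau_{x_1^n(\ell'_j)}}\, \Pi_{\bar\tau}\Big] \\
&\leq 2^{-n[H(B)-\delta]}\ \mathbb{E}_{X_1^n}\ \mathrm{Tr}\Big[\Pi_{\tau_{x_1^n(\ell'_j)}}\Big] \\
&\leq 2^{-n[H(B)-\delta]}\ 2^{n[H(B|X_1)+\delta]} = 2^{-n[I(X_1;B)-2\delta]}.
\end{aligned}$$

The expectation factorizes in the second line because $\ell'_j \neq \ell_j$, so the relay codeword
$x_1^n(\ell'_j)$ is independent of the codewords that define the transmitted state. The three
inequalities follow from (7.19), from $\Pi_{\bar\tau} \leq I$, and from (7.18), respectively.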



Combining the upper bounds on (B1) and (B2) gives our final upper bound:

$$\begin{aligned}
\mathbb{E}_{U^nX^nX_1^n}\{\text{(B)}\} &= \mathbb{E}_{U^nX^nX_1^n} \sum_{\ell'_j \neq \ell_j,\ m'_j} \text{(B1)} \times \text{(B2)} \\
&\leq \sum_{\ell'_j \neq \ell_j,\ m'_j} 2^{-n[I(UX;B|X_1)-2\delta]} \times 2^{-n[I(X_1;B)-2\delta]} \\
&\leq |\mathcal{L}||\mathcal{M}|\ 2^{-n[I(X_1;B)+I(UX;B|X_1)-4\delta]}.
\end{aligned} \tag{7.22}$$

By choosing the size of the message sets to satisfy equations (7.17) and (7.22), the
expectation of the average probability of error at the destination becomes arbitrarily small
for $n$ sufficiently large.

7.3.2 Decoding at the relay



In this section we give the details of the POVM construction and the error analysis for 
the decoding at the relay. 



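POVM Construction. During block $j$, the relay wants to decode the message $\ell_j$
encoded in $u^n(\ell_j, \ell_{j-1})$, given the knowledge of the message $\ell_{j-1}$ from the previous
block. Consider the state obtained by tracing over the systems $X$ and $B$ in (7.8):

$$\theta^{UX_1B_1} = \sum_{u,x_1} p(u|x_1)\,p(x_1)\, |u\rangle\langle u|^{U} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes \sigma^{B_1}_{u,x_1},$$

where $\sigma^{B_1}_{u,x_1} \equiv \sum_{x} p(x|x_1,u)\, \mathrm{Tr}_B\big[\rho^{B_1B}_{x,x_1}\big]$. Further tracing
over the system $U$ leads to the state

$$\theta^{X_1B_1} = \sum_{x_1} p(x_1)\, |x_1\rangle\langle x_1|^{X_1} \otimes \bar\sigma^{B_1}_{x_1},$$

where $\bar\sigma^{B_1}_{x_1} \equiv \sum_{u} p(u|x_1)\, \sigma^{B_1}_{u,x_1}$. Corresponding to the above
conditional states are conditionally typical projectors of the following form:

$$\Pi^{B^n_{1(j)}}_{\sigma_{u^n(\ell_j,\ell_{j-1}),\,x_1^n(\ell_{j-1})}}, \qquad \Pi^{B^n_{1(j)}}_{\bar\sigma_{x_1^n(\ell_{j-1})}}.$$

The relay constructs a square-root measurement $\{\Gamma_{\ell_j|\ell_{j-1}}\}$ using the following
positive operators:

$$P^{B^n_{1(j)}}_{\ell_j|\ell_{j-1}} \equiv \Pi_{\bar\sigma_{x_1^n(\ell_{j-1})}}\ \Pi_{\sigma_{u^n(\ell_j,\ell_{j-1}),\,x_1^n(\ell_{j-1})}}\ \Pi_{\bar\sigma_{x_1^n(\ell_{j-1})}}. \tag{7.23}$$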



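Error analysis. In this section we show that during block $j$ the relay will be able
to decode the message $\ell_j$ from the state $\rho^{B^n_{1(j)}}_{m_j,\ell_j,\ell_{j-1}}$, provided the
rate satisfies $R_\ell < I(U;B_1|X_1) - 3\delta$, where
$I(U;B_1|X_1) = H(B_1|X_1) - H(B_1|UX_1)$. The bound follows from the following
properties of typical projectors:

$$\mathrm{Tr}\big[\Pi_{\sigma_{u^n,x_1^n}}\big] \leq 2^{n[H(B_1|UX_1)+\delta]}, \tag{7.24}$$

$$\Pi_{\bar\sigma_{x_1^n}}\ \bar\sigma_{x_1^n}\ \Pi_{\bar\sigma_{x_1^n}} \leq 2^{-n[H(B_1|X_1)-\delta]}\ \Pi_{\bar\sigma_{x_1^n}}. \tag{7.25}$$

Recall that the average probability of error at the relay is given by:

$$\bar{p}^{R}_e \equiv \frac{1}{|\mathcal{L}|} \sum_{\ell_j} \mathrm{Tr}\Big[\big(I - \Gamma_{\ell_j|\ell_{j-1}}\big)\, \rho^{B^n_{1(j)}}_{m_j,\ell_j,\ell_{j-1}}\Big].$$

We consider the probability of error for a single message $\ell_j$ and begin by applying
the Hayashi-Nagaoka operator inequality (Lemma 3.1) to split the error into two terms:

$$\begin{aligned}
\mathrm{Tr}\Big[\big(I - \Gamma_{\ell_j|\ell_{j-1}}\big)\, \rho^{B^n_{1(j)}}_{m_j,\ell_j,\ell_{j-1}}\Big]
&\leq 2\,\underbrace{\mathrm{Tr}\Big[\big(I - P^{B^n_{1(j)}}_{\ell_j|\ell_{j-1}}\big)\, \rho^{B^n_{1(j)}}_{m_j,\ell_j,\ell_{j-1}}\Big]}_{\text{(I)}}
+ 4\,\underbrace{\sum_{\ell'_j \neq \ell_j} \mathrm{Tr}\Big[P^{B^n_{1(j)}}_{\ell'_j|\ell_{j-1}}\, \rho^{B^n_{1(j)}}_{m_j,\ell_j,\ell_{j-1}}\Big]}_{\text{(II)}}.
\end{aligned}$$

We will bound the expectation of the average probability of error by bounding the
individual terms. We bound the first term as follows, writing $\rho$ for
$\rho^{B^n_{1(j)}}_{m_j,\ell_j,\ell_{j-1}}$, $\Pi_{\sigma}$ for $\Pi_{\sigma_{u^n(\ell_j,\ell_{j-1}),x_1^n(\ell_{j-1})}}$ and
$\Pi_{\bar\sigma}$ for $\Pi_{\bar\sigma_{x_1^n(\ell_{j-1})}}$:

$$\begin{aligned}
\text{(I)} &= \mathrm{Tr}\big[\big(I - P^{B^n_{1(j)}}_{\ell_j|\ell_{j-1}}\big)\, \rho\big] \\
&= \mathrm{Tr}\big[\big(I - \Pi_{\bar\sigma}\, \Pi_{\sigma}\, \Pi_{\bar\sigma}\big)\, \rho\big] \\
&= 1 - \mathrm{Tr}\big[\Pi_{\sigma}\ \Pi_{\bar\sigma}\, \rho\, \Pi_{\bar\sigma}\big] \\
&\leq 1 - \mathrm{Tr}\big[\Pi_{\sigma}\, \rho\big] + \big\| \Pi_{\bar\sigma}\, \rho\, \Pi_{\bar\sigma} - \rho \big\|_1,
\end{aligned}$$

where the inequality follows from Lemma 2.20.

By taking the expectation over the code randomness we obtain the bound

$$\begin{aligned}
\mathbb{E}_{U^nX^nX_1^n}\{\text{(I)}\} &\leq 1 - \mathbb{E}_{U^nX^nX_1^n} \mathrm{Tr}\big[\Pi_{\sigma}\, \rho\big] + \mathbb{E}_{U^nX^nX_1^n} \big\| \Pi_{\bar\sigma}\, \rho\, \Pi_{\bar\sigma} - \rho \big\|_1 \\
&= 1 - \mathbb{E}_{U^nX_1^n} \mathrm{Tr}\big[\Pi_{\sigma}\, \sigma_{u^n,x_1^n}\big] + \mathbb{E}_{U^nX^nX_1^n} \big\| \Pi_{\bar\sigma}\, \rho\, \Pi_{\bar\sigma} - \rho \big\|_1 \\
&\leq 1 - \mathbb{E}_{U^nX_1^n} \mathrm{Tr}\big[\Pi_{\sigma}\, \sigma_{u^n,x_1^n}\big] + 2\sqrt{\epsilon} \\
&\leq 1 - (1 - \epsilon) + 2\sqrt{\epsilon} = \epsilon + 2\sqrt{\epsilon}.
\end{aligned}$$

The first inequality follows from Lemma 3.2 and the property

$$\mathbb{E}_{U^nX^nX_1^n}\ \mathrm{Tr}\big[\Pi_{\bar\sigma}\, \rho\big] \geq 1 - \epsilon. \tag{7.26}$$

The second inequality follows from:

$$\mathbb{E}_{U^nX_1^n}\ \mathrm{Tr}\big[\Pi_{\sigma}\, \sigma_{u^n,x_1^n}\big] \geq 1 - \epsilon. \tag{7.27}$$

To bound the second term we proceed as follows:

$$\begin{aligned}
\mathbb{E}_{U^nX^nX_1^n}\{\text{(II)}\} &= \mathbb{E}_{U^nX^nX_1^n} \sum_{\ell'_j \neq \ell_j} \mathrm{Tr}\Big[P^{B^n_{1(j)}}_{\ell'_j|\ell_{j-1}}\, \rho^{B^n_{1(j)}}_{m_j,\ell_j,\ell_{j-1}}\Big] \\
&= \mathbb{E}_{X_1^n} \sum_{\ell'_j \neq \ell_j} \mathrm{Tr}\Big[\mathbb{E}_{U^n|X_1^n}\big\{P^{B^n_{1(j)}}_{\ell'_j|\ell_{j-1}}\big\}\ \mathbb{E}_{U^nX^n|X_1^n}\big\{\rho^{B^n_{1(j)}}_{m_j,\ell_j,\ell_{j-1}}\big\}\Big] \\
&= \mathbb{E}_{X_1^n} \sum_{\ell'_j \neq \ell_j} \mathrm{Tr}\Big[\mathbb{E}_{U^n|X_1^n}\big\{P^{B^n_{1(j)}}_{\ell'_j|\ell_{j-1}}\big\}\ \bar\sigma_{x_1^n(\ell_{j-1})}\Big].
\end{aligned}$$

The expectation can be broken up because $\ell'_j \neq \ell_j$ and thus the $U^n$ codewords are
independent. We have also used

$$\mathbb{E}_{U^nX^n|X_1^n}\big\{\rho^{B^n_{1(j)}}_{m_j,\ell_j,\ell_{j-1}}\big\} = \bar\sigma_{x_1^n(\ell_{j-1})}. \tag{7.28}$$

We continue by expanding the operator $P^{B^n_{1(j)}}_{\ell'_j|\ell_{j-1}}$ as follows, writing
$\Pi'_{\sigma}$ for the projector built from the codeword $u^n(\ell'_j, \ell_{j-1})$:

$$\begin{aligned}
\mathbb{E}_{U^nX^nX_1^n}\{\text{(II)}\} &= \mathbb{E}_{U^nX_1^n} \sum_{\ell'_j \neq \ell_j} \mathrm{Tr}\big[\Pi_{\bar\sigma}\, \Pi'_{\sigma}\, \Pi_{\bar\sigma}\ \bar\sigma_{x_1^n}\big] \\
&= \mathbb{E}_{U^nX_1^n} \sum_{\ell'_j \neq \ell_j} \mathrm{Tr}\big[\Pi'_{\sigma}\ \underbrace{\Pi_{\bar\sigma}\, \bar\sigma_{x_1^n}\, \Pi_{\bar\sigma}}_{①}\big] \\
&\leq 2^{-n[H(B_1|X_1)-\delta]}\ \mathbb{E}_{U^nX_1^n} \sum_{\ell'_j \neq \ell_j} \mathrm{Tr}\big[\Pi'_{\sigma}\, \Pi_{\bar\sigma}\big] \\
&\leq 2^{-n[H(B_1|X_1)-\delta]}\ \mathbb{E}_{U^nX_1^n} \sum_{\ell'_j \neq \ell_j} \mathrm{Tr}\big[\Pi'_{\sigma}\big] \\
&\leq 2^{-n[H(B_1|X_1)-\delta]}\ 2^{n[H(B_1|UX_1)+\delta]}\ |\mathcal{L}| \\
&= |\mathcal{L}|\ 2^{-n[I(U;B_1|X_1)-2\delta]}.
\end{aligned}$$

The first inequality follows from using (7.25) on the expression ①. The second inequality
follows from the fact that $\Pi_{\bar\sigma}$ is a positive semidefinite operator less than or equal
to the identity. More precisely we have

$$\mathrm{Tr}\big[\Pi'_{\sigma}\, \Pi_{\bar\sigma}\big] = \mathrm{Tr}\big[\Pi'_{\sigma}\, \Pi_{\bar\sigma}\, \Pi'_{\sigma}\big] \leq \mathrm{Tr}\big[\Pi'_{\sigma}\, I\, \Pi'_{\sigma}\big] = \mathrm{Tr}\big[\Pi'_{\sigma}\big].$$

The penultimate inequality follows from (7.24).

Thus if we choose $R_\ell < I(U;B_1|X_1) - 3\delta$, we can make the expectation of the
average probability of error at the relay vanish in the limit of many uses of the channel.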



Proof conclusion. Note that the gentle operator lemma for ensembles is used
several times in the proof. First, it is used to guarantee that the effect of acting with
one of the projectors from the "measurement sandwich" does not disturb the state
too much. Furthermore, because each of the output blocks is operated on twice, we
depend on the gentle operator lemma to guarantee that the disturbance to the state
during the first decoding stage is asymptotically negligible if the correct messages are
decoded.
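
As a reading aid, the rate conditions from the destination and relay analyses can be collected and combined into the bound of Theorem 7.1. The following summary is a sketch that drops the $\delta$ terms; it relies on the fact that, in the state (7.8), the output depends on $U$ only through $(X, X_1)$, which is what gives $I(UXX_1;B) = I(XX_1;B)$.

```latex
\begin{align*}
R_\ell &< I(U;B_1|X_1)_\theta
  && \text{(relay decodes } \ell_j\text{)} \\
R_m &< I(X;B|UX_1)_\theta
  && \text{(destination: correct } \ell_j\text{, wrong } m_j\text{)} \\
R_\ell + R_m &< I(X_1;B)_\theta + I(UX;B|X_1)_\theta = I(XX_1;B)_\theta
  && \text{(destination: wrong } \ell_j\text{)} \\[4pt]
\Longrightarrow\quad R = R_\ell + R_m
  &< \min\big\{\, I(XX_1;B)_\theta,\ I(U;B_1|X_1)_\theta + I(X;B|UX_1)_\theta \,\big\}.
\end{align*}
```

Conversely, any $R$ strictly below this minimum can be split as $R = R_\ell + R_m$ with both individual constraints satisfied, for instance by taking $R_m$ just below $\min\{R, I(X;B|UX_1)_\theta\}$ and assigning the remainder to $R_\ell$.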



7.4 Discussion 



In this chapter, we established the achievability of the rates given by the partial decode- 
and-forward strategy, thus extending the study of classical-quantum channels to multi- 
hop scenarios. 

The new techniques from this chapter are the use of coherent codebooks and the
AND-measurement, which collectively decodes messages from two blocks of the output
of the channel.

We obtain the decode-and-forward inner bound as a corollary of Theorem 7.1.

Corollary 7.1 (Decode-and-forward strategy for quantum relay channel). The rates
$R$ satisfying

$$R \leq \max_{p(x,x_1)} \min\{\, I(X,X_1;B)_\theta,\ I(X;B_1|X_1)_\theta \,\} \tag{7.29}$$

where the mutual information quantities are taken with respect to the state

$$\theta^{XX_1B_1B} \equiv \sum_{x,x_1} p_{X|X_1}(x|x_1)\, p_{X_1}(x_1)\ |x\rangle\langle x|^{X} \otimes |x_1\rangle\langle x_1|^{X_1} \otimes \rho^{B_1B}_{x,x_1}, \tag{7.30}$$

are achievable for quantum relay channels by setting $X = U$ in Theorem 7.1.
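
The reduction from (7.7) to (7.29) can be checked in one line. With the choice $U = X$, conditioning on $U$ is the same as conditioning on $X$, so the second mutual information in the sum vanishes:

```latex
\begin{align*}
I(U;B_1|X_1)_\theta + I(X;B|X_1U)_\theta
  &= I(X;B_1|X_1)_\theta + I(X;B|X_1X)_\theta \\
  &= I(X;B_1|X_1)_\theta + 0 .
\end{align*}
```

The first term of the minimum, $I(XX_1;B)_\theta$, does not involve $U$ and is unchanged.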



Note also that setting $x_1$ to a fixed input in Theorem 7.1 would give us a
quantum direct coding inner bound similar to the one from equation (7.2).



An interesting open question is to determine a compress-and-forward strategy for
the quantum setting. This could possibly involve combining results from quantum
source coding and quantum channel coding [DHW11, WS12].

Another avenue for research would be to consider quantum communication and
entanglement distillation scenarios on a quantum relay network. Further research in this
area would have applications for the design of quantum repeaters [CGDR05, Dut11b].

Bosonic interference channels



Optical communication links form the backbone of the information superhighway which 
is the Internet. A single optical fiber can carry hundreds of gigabits of data per sec- 
ond over long distances thanks to the excellent light-transmission properties of glass 
materials. Free-space optical communication is also possible at rates of hundreds of
megabits per second [TNO02].

An optical communication system consists of a modulated source of photons, the
optical channel (or more generally the bosonic channel, since photons are bosons), and
an optical detector. Figure 4.2 on page 42 illustrates an example of such a
communication system.

As information theorists, we are interested in determining the ultimate limits on 
the rates for communication over such channels. For each possible combination of the 
optical encoding and optical decoding strategies, we obtain a different communication 
model for which we can calculate the capacity. More generally, we are interested in 
the ultimate capacity of the bosonic channel as permitted by the laws of physics. For 
this purpose we must optimize over all possible encoding and decoding strategies, both 
practical and theoretical. 

In this chapter we present a quantum treatment of a free-space optical interference
channel. We consider the performance of laser-light encoding (coherent light) in
conjunction with three detection strategies: (1) homodyne, (2) heterodyne, and (3)
joint detection. In Section 8.1, we will introduce some basic notions of quantum optics
which are required for the remainder of the chapter. In Section 8.2 we will discuss
previous results on bosonic quantum channels and describe the known capacity formulas
for point-to-point free-space bosonic channels for the three detection strategies. In
Section 8.3 we define the bosonic interference channel model and calculate the capacity
region for the special cases of "strong" and "very strong" interference for each detection
strategy. We also establish the Han-Kobayashi achievable rate regions for homodyne,
heterodyne and joint detection.



8.1 Preliminaries 
8.1.1 Gaussian channels 

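We begin by introducing some notation. Define the real-valued Gaussian probability
density function with mean $\mu$ and variance $\sigma^2$ as follows:

$$\mathcal{N}_{\mathbb{R}}(x; \mu, \sigma^2) \equiv \frac{1}{\sqrt{2\pi\sigma^2}}\ e^{-\frac{(x-\mu)^2}{2\sigma^2}} \in \mathcal{P}(\mathbb{R}). \tag{8.1}$$

Define also the circularly symmetric complex-valued Gaussian distribution

$$\mathcal{N}_{\mathbb{C}}(z; \mu, \sigma^2) \equiv \frac{1}{2\pi\sigma^2}\ e^{-\frac{|z-\mu|^2}{2\sigma^2}} \in \mathcal{P}(\mathbb{C}), \tag{8.2}$$

where we identify $z = x + iy$ and assume that the variance parameter is real-valued,
$\sigma^2 \in \mathbb{R}$. Note also that in the complex-valued case, the quantity $\sigma^2$ represents the
variance per real dimension: a variable $Z \sim \mathcal{N}_{\mathbb{C}}(\mu, \sigma^2)$ will have variance
$\mathrm{Var}\{Z\} = \mathbb{E}_Z\big[|Z-\mu|^2\big] = 2\sigma^2$.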
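
The factor of two can be verified directly. Writing $Z = X + iY$, where the real and imaginary parts each have variance $\sigma^2$ (this is the "variance per real dimension" convention stated above), a short derivation gives:

```latex
\begin{align*}
\mathrm{Var}\{Z\} &= \mathbb{E}\big[\,|Z-\mu|^2\,\big] \\
  &= \mathbb{E}\big[(X-\mathrm{Re}\,\mu)^2\big] + \mathbb{E}\big[(Y-\mathrm{Im}\,\mu)^2\big] \\
  &= \sigma^2 + \sigma^2 \;=\; 2\sigma^2 .
\end{align*}
```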

The additive white Gaussian noise (AWGN) channel is a communication model
where the input and output are continuous random variables and the noise is Gaussian.
Let $X$ be the random variable associated with the input of the channel. Then the
output variable $Y$ will be:

$$Y = X + Z, \tag{8.3}$$

where $Z \sim \mathcal{N}_{\mathbb{R}}(0, N)$ is a Gaussian random variable with zero mean and variance $N$.
As in the discrete memoryless case, we can use a codebook $\{x^n(m)\}$, $m \in [1 : 2^{nR}]$,
with codewords generated randomly and independently according to a probability
density function $\prod_{i=1}^{n} p_X(x_i)$. Furthermore we impose an average power constraint on the
codebook:

$$\mathbb{E}_{X^n}\Big\{ \frac{1}{n} \sum_{i=1}^{n} X_i^2 \Big\} \leq P. \tag{8.4}$$



The channel capacity is calculated using the differential entropy,
$h : \mathcal{P}(\mathbb{R}) \to \mathbb{R}$, which plays the role of the Shannon entropy for continuous
random variables. We know from Shannon's channel capacity theorem (Theorem 3.1) that a rate
$R$ is achievable provided it is less than the mutual information of the joint probability
distribution induced by the input distribution and the channel: $(X, Y) \sim p_X p_{Y|X}$. For any
choice of input distribution $p_X$, the following rate is achievable:

$$\begin{aligned}
R < I(X; Y) &= h(Y) - h(Y|X) \\
&= h(Y) - h(X + Z|X) \\
&= h(Y) - h(Z|X) \\
&= h(Y) - h(Z).
\end{aligned} \tag{8.5}$$

The last equality follows because the noise $Z$ is assumed to be independent of the input
$X$. It can be shown that a Gaussian distribution with variance $P$ is the optimal choice
of input distribution [CT91]. Furthermore, when we choose $X \sim \mathcal{N}_{\mathbb{R}}(0, P)$ it is possible
to compute the above expression exactly and obtain the capacity:

$$C = \frac{1}{2} \log_2\!\Big(1 + \frac{P}{N}\Big) \quad \text{[bits/use]}. \tag{8.6}$$

We will refer to the ratio $P/N$ as the signal to noise ratio. We sometimes abbreviate
this expression as: $\gamma(\mathrm{SNR}) \equiv \frac{1}{2} \log_2(1 + \mathrm{SNR})$. The above formula is one of
the great successes of classical information theory.
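
A small numerical illustration of (8.6) may help fix the orders of magnitude. This is a sketch with hypothetical function names `gamma` and `awgn_capacity`; it simply evaluates the formula and makes no claim about any particular physical link.

```python
from math import log2

def gamma(snr):
    """gamma(SNR) = (1/2) log2(1 + SNR), in bits per channel use."""
    return 0.5 * log2(1.0 + snr)

def awgn_capacity(power, noise):
    """Capacity (8.6) of the real AWGN channel with power constraint P
    and noise variance N."""
    return gamma(power / noise)

print(awgn_capacity(3.0, 1.0))    # SNR = 3  -> 1.0 bit per use
print(awgn_capacity(15.0, 1.0))   # SNR = 15 -> 2.0 bits per use
```

The two printed values show the logarithmic growth of capacity with power: raising the SNR from 3 to 15 adds only one bit per channel use.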



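The Gaussian multiple access channel is defined as:

$$Y = \sqrt{\alpha}\, X_1 + \sqrt{\beta}\, X_2 + Z, \tag{8.7}$$

where $\alpha, \beta \in \mathbb{R}$ are the gain coefficients and $Z \sim \mathcal{N}_{\mathbb{R}}(0, N)$ is an additive
Gaussian noise term with average power $N$. When input power constraints
$\mathbb{E}_{X_1^n}\big\{ \frac{1}{n} \sum_{i=1}^{n} X_{1i}^2 \big\} \leq P_1$ and
$\mathbb{E}_{X_2^n}\big\{ \frac{1}{n} \sum_{i=1}^{n} X_{2i}^2 \big\} \leq P_2$ are imposed, the capacity region is
given by:

$$\mathcal{C}_{\mathrm{MAC}} \equiv \left\{ (R_1, R_2) \in \mathbb{R}^2 \ \middle|\
\begin{array}{rcl}
R_1 &\leq& I(X_1;Y|X_2) = \frac{1}{2}\log_2\!\big(1 + \frac{\alpha P_1}{N}\big) \\[2pt]
R_2 &\leq& I(X_2;Y|X_1) = \frac{1}{2}\log_2\!\big(1 + \frac{\beta P_2}{N}\big) \\[2pt]
R_1 + R_2 &\leq& I(X_1X_2;Y) = \frac{1}{2}\log_2\!\big(1 + \frac{\alpha P_1 + \beta P_2}{N}\big)
\end{array} \right\}.$$

Each of the constraints on the capacity region has an intuitive interpretation in terms
of signal to noise ratios. In this context, we also have the expression
$I(X_1;Y) = \frac{1}{2} \log_2\!\big(1 + \frac{\alpha P_1}{N + \beta P_2}\big)$, in which the unknown codewords
of the second transmitter are treated as contributing to the noise.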
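
The three constraints of the Gaussian multiple access region, and the rate obtained when the second sender is treated as noise, are easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, and the "corner point" interpretation (decode sender 1 first while treating sender 2 as noise, then decode sender 2) is the standard successive decoding reading of the expression $I(X_1;Y)$ above, stated here as an assumption.

```python
from math import log2

def gamma(snr):
    """gamma(SNR) = (1/2) log2(1 + SNR)."""
    return 0.5 * log2(1.0 + snr)

def in_gaussian_mac_region(r1, r2, p1, p2, alpha, beta, noise):
    """True if (r1, r2) satisfies the three constraints of the Gaussian MAC region."""
    return (0.0 <= r1 <= gamma(alpha * p1 / noise)
            and 0.0 <= r2 <= gamma(beta * p2 / noise)
            and r1 + r2 <= gamma((alpha * p1 + beta * p2) / noise))

def successive_decoding_corner(p1, p2, alpha, beta, noise):
    """Rates when sender 1 is decoded first (sender 2 treated as noise)
    and sender 2 is decoded afterwards without interference."""
    r1 = gamma(alpha * p1 / (noise + beta * p2))
    r2 = gamma(beta * p2 / noise)
    return r1, r2

r1, r2 = successive_decoding_corner(3.0, 3.0, 1.0, 1.0, 1.0)
sum_rate = gamma((1.0 * 3.0 + 1.0 * 3.0) / 1.0)
print(r1, r2, sum_rate)
```

The corner point meets the sum-rate constraint with equality (up to floating-point rounding), which is why treating one sender as noise and then cancelling it loses nothing in total throughput.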



8.1.2 Introduction to quantum optics 

Photons are excitations of the electromagnetic field. We say that photons are bosons
because they obey Bose-Einstein statistics: they are indistinguishable particles that
are symmetric under exchange.¹ Multiple bosons with the same energy can occupy the
same quantum state. This is in contrast with fermions, which obey Pauli's exclusion
principle. Bosonic channels are channels in which the inputs and the outputs are
bosons.

In this section, we will introduce some background material on quantum optics
which is needed for the rest of the presentation in this chapter. Recall that the states
of quantum systems are described by density operators $\sigma, \rho \in \mathcal{D}(\mathcal{H})$, where
$\mathcal{H}$ is a Hilbert space. Unitary quantum operations act by conjugation, so that by applying
$U$ to $\sigma$ we obtain $\rho = U\sigma U^\dagger$ as output. The expectation value of some operator $A$
when the system is in the state $\rho$ is denoted $\langle A \rangle = \mathrm{Tr}[A\rho]$.

Let $\rho_0 = |0\rangle\langle 0|$ be the vacuum state of one mode of the electromagnetic field. We
define $\hat a^\dagger$ to be the creation operator for that mode. Applying $\hat a^\dagger$ to the vacuum
state we obtain the first excited state:

$$|1\rangle\langle 1| = \hat a^\dagger |0\rangle\langle 0|\, \hat a, \tag{8.8}$$

and this process can be iterated to create further excitations in the field. The Hermitian
conjugate of the creation operator is the annihilation operator $\hat a$, which takes away
excitations from the field. More generally, we have

$$\hat a\, |n\rangle = \sqrt{n}\ |n-1\rangle, \tag{8.9}$$

$$\hat a^\dagger |n\rangle = \sqrt{n+1}\ |n+1\rangle, \tag{8.10}$$

$$|n\rangle = \frac{(\hat a^\dagger)^n}{\sqrt{n!}}\ |0\rangle. \tag{8.11}$$

The state space $|0\rangle\langle 0|, |1\rangle\langle 1|, |2\rangle\langle 2|, |3\rangle\langle 3|, \ldots$ is known as Fock space
and it is infinite dimensional. The creation and annihilation operators obey the commutation
relation $[\hat a, \hat a^\dagger] = 1$.

¹ The wave function describing two photons $p_1$ and $p_2$ is even under exchange of the two
particles: $\psi(p_1, p_2) = \psi(p_2, p_1)$.

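The real part and the imaginary part of the operator $\hat a$ are defined as the $x$
quadrature and the $p$ quadrature:

$$\hat x \equiv \frac{\hat a + \hat a^\dagger}{2}, \qquad \hat p \equiv \frac{\hat a - \hat a^\dagger}{2i}, \tag{8.12}$$

and we have $[\hat x, \hat p] = \frac{i}{2}$.

If we want to measure how many excitations are in the field, we use the number
operator $\hat N \equiv \hat a^\dagger \hat a$. If the field is in excitation level $n$, the expected number of
excitations will be:

$$\langle \hat N \rangle = \mathrm{Tr}\big[\hat a^\dagger \hat a\, |n\rangle\langle n|\big] = n. \tag{8.13}$$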
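
The ladder-operator relations can be checked numerically on a truncated Fock space. The sketch below is an illustration under an explicit assumption: Fock space is cut off at a finite dimension `dim`, which is not part of the text and which makes the commutation relation fail in the last diagonal entry. The helper names are hypothetical.

```python
from math import sqrt

def annihilation(dim):
    """Matrix of the annihilation operator in the truncated Fock basis
    |0>, ..., |dim-1>, using a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = sqrt(n)
    return a

def transpose(m):
    """Transpose; equals the Hermitian conjugate here because entries are real."""
    return [list(row) for row in zip(*m)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

dim = 6
a = annihilation(dim)
a_dag = transpose(a)
number = matmul(a_dag, a)            # diagonal entries 0, 1, ..., dim-1
a_a_dag = matmul(a, a_dag)
commutator = [[a_a_dag[i][j] - number[i][j] for j in range(dim)]
              for i in range(dim)]

print([round(number[n][n], 12) for n in range(dim)])
print([round(commutator[n][n], 12) for n in range(dim)])
```

The number operator has the expected diagonal, and the commutator equals the identity on every level except the highest one, where the truncation removes the state that the creation operator would otherwise reach.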



The Hamiltonian that describes one non-interacting mode of the electromagnetic 
field is given by: 

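$$H = \hbar\omega\left(a^\dagger a + \tfrac{1}{2}\right). \tag{8.14}$$

The Hamiltonian is important because it gives the time evolution operator $U(t) = e^{-iHt/\hbar}$
and the energy of the system: $E_\rho = \langle H \rangle = \mathrm{Tr}[H\rho]$. Observe that the system has
energy even when it is in the vacuum state:

$$E_0 = \mathrm{Tr}\left[H|0\rangle\langle 0|\right] = \langle 0|H|0\rangle = \hbar\omega\,\langle 0|\left(a^\dagger a + \tfrac{1}{2}\right)|0\rangle = \frac{\hbar\omega}{2}. \tag{8.15}$$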

This is known as the zero-point energy or vacuum energy. 
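
The operator relations above can be checked numerically on a truncated Fock space. The following is a minimal sketch, not part of the original presentation: the truncation level `dim` and the helper name `annihilation` are illustrative assumptions, and energies are expressed in units where $\hbar\omega = 1$.

```python
import numpy as np

def annihilation(dim):
    """Annihilation operator truncated to the Fock levels |0>, ..., |dim-1>."""
    # a|n> = sqrt(n)|n-1>, so sqrt(1), ..., sqrt(dim-1) sit on the first superdiagonal.
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

dim = 8                        # truncation level (an arbitrary illustrative choice)
a = annihilation(dim)
a_dag = a.conj().T             # creation operator
number_op = a_dag @ a          # N = a^dagger a

# [a, a^dagger] is the identity except for the last diagonal entry,
# which is an artifact of cutting off the infinite-dimensional space.
commutator = a @ a_dag - a_dag @ a

# Hamiltonian in units where hbar * omega = 1, so H = N + 1/2.
hamiltonian = number_op + 0.5 * np.eye(dim)

vacuum = np.zeros(dim)
vacuum[0] = 1.0
fock_3 = np.zeros(dim)
fock_3[3] = 1.0

vacuum_energy = vacuum @ hamiltonian @ vacuum   # zero-point energy
mean_photons = fock_3 @ number_op @ fock_3      # <N> in the state |3>
```

Any finite truncation breaks the commutation relation in the highest level, which is why only the upper-left block is compared against the identity.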

8.1.3 Coherent states

A composite system exhibits coherence if all its components somehow coincide with 
each other. This could be either coincidence in time, space coherence, phase coherence 
or quantum coherence. An example of the latter is the process of stimulated emission 
of photons which occurs inside a laser. All new photons are created exactly "in phase" 
with the other photons inside the laser. Over time the number of photons in the laser 
will grow, but they will all have the same frequency, phase and polarization. 

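The coherent state $|\alpha\rangle$ describes an oscillation of the electromagnetic field. In
general $\alpha \in \mathbb{C}$ and we have $\alpha = |\alpha|e^{i\varphi}$, where $|\alpha|$ is the amplitude of the oscillation
and $\varphi$ is the initial phase. In the Fock basis, the coherent state $|\alpha\rangle$ is written as:

$$|\alpha\rangle = e^{-\frac{|\alpha|^2}{2}} \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\, |n\rangle \tag{8.16}$$
$$= e^{-\frac{|\alpha|^2}{2}} \left( |0\rangle + |\alpha|e^{i\varphi}|1\rangle + \frac{|\alpha|^2}{\sqrt{2}}e^{2i\varphi}|2\rangle + \frac{|\alpha|^3}{\sqrt{6}}e^{3i\varphi}|3\rangle + \cdots \right). \tag{8.17}$$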



The output of a laser is coherent light: the excitations at all energy levels will have
the same phase. Coherent states remain coherent over time: $|\alpha(t)\rangle = U(t)|\alpha\rangle = e^{-i\omega t/2}\,|e^{-i\omega t}\alpha\rangle$.

A coherent state can also be defined in terms of the unitary displacement operator
which acts as:

$$|\alpha\rangle = D(\alpha)|0\rangle = \exp\left(\alpha a^\dagger - \alpha^* a\right)|0\rangle. \tag{8.18}$$

Note that in some respect $D(\alpha)$ is similar to the creation operator $a^\dagger$, since it creates
excited states from the vacuum state.
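
As an illustration of the Fock-basis expansion of a coherent state, the sketch below computes the amplitudes $e^{-|\alpha|^2/2}\alpha^n/\sqrt{n!}$ up to a cutoff and checks that the state is normalized with mean photon number $|\alpha|^2$. The function name `coherent_amplitudes`, the cutoff `dim` and the value of `alpha` are assumptions made for this example only.

```python
import numpy as np
from math import factorial

def coherent_amplitudes(alpha, dim):
    """Fock-basis amplitudes <n|alpha> for n = 0, ..., dim-1."""
    n = np.arange(dim)
    sqrt_factorials = np.sqrt(np.array([factorial(k) for k in range(dim)], dtype=float))
    return np.exp(-abs(alpha) ** 2 / 2) * alpha ** n / sqrt_factorials

alpha = 1.5 * np.exp(1j * 0.3)      # amplitude 1.5, initial phase 0.3 rad
amps = coherent_amplitudes(alpha, dim=40)
probs = np.abs(amps) ** 2           # photon number distribution (Poisson)

norm = probs.sum()                               # close to 1 for a large enough cutoff
mean_photons = np.sum(np.arange(40) * probs)     # close to |alpha|^2
```

The photon-number probabilities form a Poisson distribution, so a cutoff a few standard deviations above $|\alpha|^2$ is enough for the truncation error to be negligible.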



8.2 Bosonic channels 

Point-to-point optical communication using laser-light modulation in conjunction with 
direct-detection and coherent-detection receivers has been studied in detail using the 
semiclassical theory of photodetection [GK95]. This approach treats light as a classical
electromagnetic field, and the fundamental noise encountered in photodetection is the 
shot noise associated with the discreteness of the electron charge. 

These semiclassical treatments for systems that exploit classical-light modulation
and conventional receivers (direct, homodyne, or heterodyne) have had some success,
but we should recall that electromagnetic waves are quantized, and the correct assessment
of systems that use non-classical light sources and/or general optical measurements
requires a full quantum-mechanical framework [Sha09]. There are several recent
theoretical studies on the point-to-point [GGL+04] [Guh11], broadcast [GSE07] and
multiple-access [Yen05a] bosonic channels. These studies have shown that quantum
communication rates (Holevo rates) surpass what can be obtained with conventional
receivers. For the general quantum channel, attaining Holevo information rates may
require collective measurements (a joint detection) across all the output systems of the
channel.

Before stating our results on the bosonic interference channel, we will briefly review 
some results on point-to-point bosonic channels in the next subsection. 



8.2.1 Channel model 

The free-space optical communication channel is a physically realistic model for the
propagation of photons from transmitter to receiver. We assume that a transmitter
aperture of size $A_t$ is placed at a distance $L$ from a receiver aperture of size $A_r$, and
that we are using $\lambda$-wavelength laser light for the transmission.

Figure 8.1: The free-space optical communication channel. Two apertures of area $A_t$ and
$A_r$ are placed $L$ distance apart. The channel decomposes into different modes of propagation.
We model the channel as a transformation from an annihilation operator on the transmit side
to an annihilation operator at the receiver side.

To analyze the communication capacity of the bosonic channel, we can decompose
the problem into finding the capacity for each of the spatial modes of propagation,
which will in general have different transmissivity coefficients $\eta$. In the far-field
propagation regime, which is when we have $A_t A_r/(\lambda L)^2 \ll 1$, only two orthogonal spatial
modes (one for each polarization degree of freedom) will have significant power
transmissivity. We will analyze the channel for a single mode (one choice of polarization).

The channel input is an electromagnetic field mode with annihilation operator $a$,
and the channel output is another mode with annihilation operator $b$. The channel
map is described by:

$$b = \sqrt{\eta}\, a + \sqrt{1-\eta}\, \nu, \tag{8.19}$$

in which $\nu$ is associated with the noise of the environment and the parameter $\eta$,
$0 \le \eta \le 1$, models the channel transmissivity.

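We say that a channel is pure-loss if the environmental noise mode $\nu$ is in the vacuum
state $|0\rangle\langle 0|$. A channel has thermal noise if the mode $\nu$ is in the thermal state:

$$\rho_T = \int d^2\alpha \; \frac{\exp\left(-|\alpha|^2/N_B\right)}{\pi N_B}\; |\alpha\rangle\langle\alpha|, \tag{8.20}$$

which is a Gaussian mixture of coherent states with average photon number $N_B > 0$.
One can also write the thermal state in the number basis as follows:

$$\rho_T = \sum_{n=0}^{\infty} \frac{N_B^n}{(N_B+1)^{n+1}}\; |n\rangle\langle n|. \tag{8.21}$$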



8.2.2 Encoding 

We will use coherent state encoding of the information at the transmitter. The codebook
consists of tensor products of vacuum states displaced randomly and independently
by an amount drawn from a distribution $p_\alpha$:

$$|\alpha_1\alpha_2\cdots\alpha_n\rangle = D(\alpha_1)|0\rangle \otimes D(\alpha_2)|0\rangle \otimes \cdots \otimes D(\alpha_n)|0\rangle, \qquad \alpha_i \sim p_\alpha.$$

This encoding strategy is chosen because it is simple to implement in practice, and
also because it is known that it suffices to achieve the ultimate capacity of the bosonic
channel [GGL+04].

When homodyne detection is used at the receiver, we will encode the information
using only the $x$ quadrature. The displacements are chosen according to:

$$\alpha \sim \mathcal{N}_{\mathbb{R}}(0, N_S). \tag{8.22}$$

The distribution is chosen so that it satisfies the constraint on the average number
of input photons $\langle |\alpha|^2 \rangle \le N_S$, which is the quantum analogue of the input power
constraint for the AWGN channel.

For heterodyne and joint detection, we will use both quadratures and choose the
displacements according to a circularly-symmetric complex-valued Gaussian distribution:

$$\alpha \sim \mathcal{N}_{\mathbb{C}}(0, N_S/2). \tag{8.23}$$
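
The two codebook distributions can be sketched as follows. This is an illustrative sampler only: the function names and the block length are assumptions, and $\mathcal{N}_{\mathbb{C}}(0, N_S/2)$ is read here as variance $N_S/2$ in each quadrature, so that the mean photon number $\langle|\alpha|^2\rangle$ equals $N_S$ in both cases.

```python
import numpy as np

def sample_homodyne_codeword(n, Ns, rng):
    """Real displacements alpha_i ~ N_R(0, Ns): only the x quadrature is used."""
    return rng.normal(0.0, np.sqrt(Ns), size=n)

def sample_heterodyne_codeword(n, Ns, rng):
    """Complex displacements with variance Ns/2 in each quadrature."""
    sigma = np.sqrt(Ns / 2)
    return rng.normal(0.0, sigma, size=n) + 1j * rng.normal(0.0, sigma, size=n)

rng = np.random.default_rng(seed=0)
Ns = 4.0
real_codeword = sample_homodyne_codeword(200_000, Ns, rng)
complex_codeword = sample_heterodyne_codeword(200_000, Ns, rng)

# Empirical mean photon numbers; both should be close to Ns.
mean_photons_real = np.mean(np.abs(real_codeword) ** 2)
mean_photons_complex = np.mean(np.abs(complex_codeword) ** 2)
```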



8.2.3 Homodyne detection 



Homodyne detection consists of combining on a beamsplitter the incoming light and 
a local oscillator signal and measuring the resulting difference of the intensities. By 
tuning the relative phase between the incoming signal and the local oscillator it is 
possible to measure the incoming photons in any quadrature. 



When coherent state encoding is used with displacement values chosen as in (8.22) 
and homodyne detection is used, the resulting channel is Gaussian: 



$$Y = \sqrt{\eta}\,\alpha + Z_{hom},$$

where $Z_{hom} \sim \mathcal{N}_{\mathbb{R}}\left(0, (2(1-\eta)N_B + 1)/4\right)$. The "+1" term in the noise variance arises
physically from the zero-point fluctuations of the vacuum.

We can now use the general formula for the capacity of the AWGN channel
to obtain the capacity with homodyne detection:

$$C_{hom} = \frac{1}{2}\log\left(1 + \frac{4\eta N_S}{2(1-\eta)N_B + 1}\right) \text{ bits/use}. \tag{8.24}$$



8.2.4 Heterodyne detection

The heterodyne detection strategy attempts to measure the incoming light in both
quadratures. The sender inputs a coherent state $|\alpha\rangle$ with $\alpha \in \mathbb{C}$. Heterodyne detection
of the channel output results in a classical complex Gaussian channel, where the receiver
output is a complex random variable $Y$ described by:

$$Y = \sqrt{\eta}\,\alpha + Z_{het}, \tag{8.25}$$

where $Z_{het} \sim \mathcal{N}_{\mathbb{C}}\left(0, ((1-\eta)N_B + 1)/2\right)$. The capacity formula for this choice of detection
strategy is given by:

$$C_{het} = \log\left(1 + \frac{\eta N_S}{(1-\eta)N_B + 1}\right) \text{ bits/use}. \tag{8.26}$$

The factor of 1/2 in the noise variances is due to the attempt to measure both quadratures
of the field simultaneously [Sha09].



8.2.5 Joint detection 



The capacity of the single-mode lossy bosonic channel with thermal background noise 
is thought to be equal to the channel's Holevo information: 



$$\chi = g(\eta N_S + (1-\eta)N_B) - g((1-\eta)N_B) \text{ bits/use}, \tag{8.27}$$

where $N_S$ and $N_B$ are the mean photon numbers per mode for the input signal and the
thermal noise, and $g(N) = (N+1)\log(N+1) - N\log(N)$ is the entropy of a thermal
state with mean photon number $N$. The latter formula is easily obtained from (8.21):

$$\begin{aligned}
H(\rho_T) &= -\mathrm{Tr}[\rho_T \log \rho_T] \\
&= -\sum_{n=0}^{\infty} \frac{N^n}{(N+1)^{n+1}} \log\left(\frac{N^n}{(N+1)^{n+1}}\right) \\
&= \sum_{n=0}^{\infty} \frac{1}{N+1}\left(\frac{N}{N+1}\right)^n \left[ -n\log N + (n+1)\log(N+1) \right] \\
&= (N+1)\log(N+1) - N\log N = g(N).
\end{aligned}$$
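
The derivation of $g(N)$ can be verified numerically by summing the entropy of the number-basis distribution of the thermal state directly. The sketch below is only a sanity check: the function names and the truncation `dim` are assumptions, and logarithms are taken base 2 so that entropies are in bits.

```python
import numpy as np

def g(N):
    """Entropy (in bits) of a thermal state with mean photon number N."""
    if N == 0:
        return 0.0
    return (N + 1) * np.log2(N + 1) - N * np.log2(N)

def thermal_probabilities(N, dim):
    """Diagonal entries N^n / (N+1)^(n+1) of the thermal state, n = 0..dim-1."""
    n = np.arange(dim)
    return (N / (N + 1.0)) ** n / (N + 1.0)

def entropy_bits(probs):
    """Shannon entropy of a probability vector, ignoring zero entries."""
    p = probs[probs > 0]
    return float(-np.sum(p * np.log2(p)))

N = 2.5
probs = thermal_probabilities(N, dim=400)
direct_entropy = entropy_bits(probs)                 # truncated sum over n
mean_photons = float(np.sum(np.arange(400) * probs)) # should be close to N
closed_form = g(N)
```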

This capacity formula from equation (8.27) assumes a long-standing conjecture regarding
the minimum-output entropy of the thermal noise channel [GGL+04] [GHLM10].
It is known that joint-detection (collective) measurements over long codeword
blocks are necessary to achieve the rates in equation (8.27) for both the pure-loss
and the thermal-noise lossy bosonic channel [Guh11] [WGTL12]. Note, however, that
quantum states of light are not necessary to achieve the rate $\chi$; coherent-state encoding
is sufficient.
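
To compare the three detection strategies side by side, the rate formulas can be evaluated directly. The sketch below uses the heterodyne capacity and the Holevo rate as stated above; the homodyne expression is the AWGN capacity $\frac{1}{2}\log_2(1+\mathrm{SNR})$ evaluated with signal power $\eta N_S$ and noise variance $(2(1-\eta)N_B+1)/4$. Function names are illustrative assumptions.

```python
import math

def g(N):
    """Entropy (in bits) of a thermal state with mean photon number N."""
    if N == 0:
        return 0.0
    return (N + 1) * math.log2(N + 1) - N * math.log2(N)

def rate_homodyne(eta, Ns, Nb):
    # AWGN capacity with signal power eta*Ns and noise variance (2(1-eta)Nb + 1)/4.
    return 0.5 * math.log2(1 + 4 * eta * Ns / (2 * (1 - eta) * Nb + 1))

def rate_heterodyne(eta, Ns, Nb):
    # Complex Gaussian channel: two real channels with per-quadrature
    # noise variance ((1-eta)Nb + 1)/2 and signal variance eta*Ns/2.
    return math.log2(1 + eta * Ns / ((1 - eta) * Nb + 1))

def rate_holevo(eta, Ns, Nb):
    # Holevo information of the lossy thermal-noise channel, equation (8.27).
    return g(eta * Ns + (1 - eta) * Nb) - g((1 - eta) * Nb)

eta, Nb = 0.9, 1.0
rates = {
    Ns: (rate_homodyne(eta, Ns, Nb), rate_heterodyne(eta, Ns, Nb), rate_holevo(eta, Ns, Nb))
    for Ns in (0.01, 1.0, 100.0)
}
```

For these parameters homodyne detection is ahead of heterodyne detection at small $N_S$ and behind it at large $N_S$, while the Holevo rate exceeds both at every value tried.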




Figure 8.2: The achievable rates for the different decoding strategies: homodyne, heterodyne
and joint detection (Holevo rate) as a function of the average photon number, in the low
photon number regime $0.01 \le \langle|\alpha|^2\rangle = N_S \le 100$. The
channel has $\eta = 0.9$ and $N_B = 1$. The joint detection strategy outperforms the classical
strategies in which the outputs of the channel are measured individually, cf. Figure 3.5.

The rates achievable by the three different detection strategies are illustrated in
Figure 8.2, and on this we conclude our review of point-to-point bosonic communication.



In the next section, we consider the bosonic interference channel with thermal-noise, 
particularly in the context of free-space terrestrial optical communications. 



8.3 Free-space optical interference channels 



Consider now a scenario similar to the one described in Figure 8.1, but now assume 
that there are two senders and two receivers. Sender 1 modulates her information on 
the first spatial mode of the transmitter-pupil, and Receiver 1 separates and demod- 
ulates information from the corresponding receiver-pupil spatial mode. With perfect 
spatial-mode control at the transmitter and perfect mode separation at the receiver, the
orthogonal spatial modes can be thought of as independent parallel channels with no 
crosstalk. However, imperfect (slightly non-orthogonal) mode generation or imperfect 
mode separation can result in crosstalk (interference) between the different channels. 

We will model the bosonic interference channel as a passive linear mixing of the 
input modes along with a thermal environment adding zero-mean, isotropic Gaussian 
noise. The channel model is given by: 

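$$b_1 = \sqrt{\eta_{11}}\,a_1 + \sqrt{\eta_{21}}\,a_2 + \sqrt{\bar\eta_1}\,\nu_1, \tag{8.28}$$
$$b_2 = \sqrt{\eta_{12}}\,a_1 - \sqrt{\eta_{22}}\,a_2 + \sqrt{\bar\eta_2}\,\nu_2, \tag{8.29}$$

where $\eta_{11}, \eta_{12}, \eta_{21}, \eta_{22}, \bar\eta_1, \bar\eta_2 \in \mathbb{R}^+$, $\sqrt{\eta_{11}\eta_{12}} = \sqrt{\eta_{21}\eta_{22}}$, $\bar\eta_1 = 1 - \eta_{11} - \eta_{21}$, and
$\bar\eta_2 = 1 - \eta_{12} - \eta_{22}$. The following conditions ensure that the network is passive:

$$\eta_{11} + \eta_{12} \le 1, \quad \eta_{11} + \eta_{21} \le 1, \quad \eta_{22} + \eta_{21} \le 1, \quad \eta_{22} + \eta_{12} \le 1.$$

We constrain the mean photon number of the transmitters $a_1$ and $a_2$ to be $N_{S_1}$ and $N_{S_2}$
photons per mode, respectively. The environment modes $\nu_1$ and $\nu_2$ are in statistically
independent zero-mean thermal states with respective mean photon numbers $N_{B_1}$ and
$N_{B_2}$ per mode [Sha09].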
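
The constraints on the transmissivities are easy to get wrong when choosing example parameters, so a small validation helper can be useful. The following is a sketch under the assumptions stated in this section (the passivity conditions and the relation $\sqrt{\eta_{11}\eta_{12}} = \sqrt{\eta_{21}\eta_{22}}$); the function name and tolerance are hypothetical.

```python
import math

def environment_couplings(eta11, eta12, eta21, eta22, tol=1e-12):
    """Check the channel constraints and return (eta_bar_1, eta_bar_2)."""
    sums = (eta11 + eta12, eta11 + eta21, eta22 + eta21, eta22 + eta12)
    if any(s > 1 + tol for s in sums):
        raise ValueError("network is not passive")
    if not math.isclose(math.sqrt(eta11 * eta12), math.sqrt(eta21 * eta22), abs_tol=tol):
        raise ValueError("sqrt(eta11*eta12) must equal sqrt(eta21*eta22)")
    # Fractions of the output power contributed by the environment modes.
    return 1 - eta11 - eta21, 1 - eta12 - eta22

# Example: own-signal transmissivity 1/16 and crosstalk 1/2 on both links.
eta_bar_1, eta_bar_2 = environment_couplings(1 / 16, 1 / 2, 1 / 2, 1 / 16)
```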



8.3.1 Detection strategies 

For a coherent state encoding and coherent^1 detection at both receivers, the above
model is a special case of the Gaussian interference channel, and we can study its
capacity regions in various settings by applying the known classical results from [Car75],
[Sat81] and [HK81].

^1 We refer to both homodyne and heterodyne strategies as coherent strategies.

If the senders prepare their inputs in coherent states $|\alpha_1\rangle$ and $|\alpha_2\rangle$, with $\alpha_1, \alpha_2 \in \mathbb{R}$,
and both receivers perform $x$-quadrature homodyne detection on their respective modes,
the result is a classical Gaussian interference channel [Sha09], where Receivers 1 and
2 obtain respective conditional Gaussian random variables $Y_1$ and $Y_2$ distributed as

$$Y_1 \sim \mathcal{N}_{\mathbb{R}}\left(\sqrt{\eta_{11}}\,\alpha_1 + \sqrt{\eta_{21}}\,\alpha_2,\; (2\bar\eta_1 N_{B_1} + 1)/4\right),$$
$$Y_2 \sim \mathcal{N}_{\mathbb{R}}\left(\sqrt{\eta_{12}}\,\alpha_1 + \sqrt{\eta_{22}}\,\alpha_2,\; (2\bar\eta_2 N_{B_2} + 1)/4\right),$$

where the "+1" term in the noise variances arises physically from the zero-point fluctuations
of the vacuum. Suppose that the senders again encode their signals as coherent
states $|\alpha_1\rangle$ and $|\alpha_2\rangle$, but this time with $\alpha_1, \alpha_2 \in \mathbb{C}$, and that the receivers both perform
heterodyne detection. This results in a classical complex Gaussian interference channel
[Sha09], where Receivers 1 and 2 detect respective conditional complex Gaussian
random variables $Z_1$ and $Z_2$, whose real parts are distributed as

$$\mathrm{Re}\{Z_m\} \sim \mathcal{N}_{\mathbb{R}}\left(\mu_m,\; (\bar\eta_m N_{B_m} + 1)/2\right), \tag{8.30}$$

where $m \in \{1, 2\}$, $\mu_1 = \sqrt{\eta_{11}}\,\mathrm{Re}\{\alpha_1\} + \sqrt{\eta_{21}}\,\mathrm{Re}\{\alpha_2\}$, $\mu_2 = \sqrt{\eta_{12}}\,\mathrm{Re}\{\alpha_1\} + \sqrt{\eta_{22}}\,\mathrm{Re}\{\alpha_2\}$,
and the imaginary parts of $Z_1$ and $Z_2$ are distributed with the same variance as
their real parts, and their respective means are $\sqrt{\eta_{11}}\,\mathrm{Im}\{\alpha_1\} + \sqrt{\eta_{21}}\,\mathrm{Im}\{\alpha_2\}$ and
$\sqrt{\eta_{12}}\,\mathrm{Im}\{\alpha_1\} + \sqrt{\eta_{22}}\,\mathrm{Im}\{\alpha_2\}$. The factor of 1/2 in the noise variances is due to the
attempt to measure both quadratures of the field simultaneously [Sha09].



8.4 Very strong interference case 



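Recall the setting of the interference channel which we discussed in Section 5.2.1, where
the crosstalk between the communication links is so strong that the receivers can fully
decode the interfering signal and "subtract" it from the received signal to completely
cancel its effects. The conditions in (5.5) and (5.6) translate to the following ones for
the case of coherent-state encoding and coherent detection:

$$\frac{\eta_{21}}{\eta_{22}} \ge \frac{4^\iota \eta_{11} N_{S_1} + 2^\iota \bar\eta_1 N_{B_1} + 1}{2^\iota \bar\eta_2 N_{B_2} + 1}, \qquad
\frac{\eta_{12}}{\eta_{11}} \ge \frac{4^\iota \eta_{22} N_{S_2} + 2^\iota \bar\eta_2 N_{B_2} + 1}{2^\iota \bar\eta_1 N_{B_1} + 1},$$

and the capacity region becomes

$$R_1 \le \frac{1}{2^\iota} \log\left(1 + \frac{4^\iota \eta_{11} N_{S_1}}{2^\iota \bar\eta_1 N_{B_1} + 1}\right), \tag{8.31}$$
$$R_2 \le \frac{1}{2^\iota} \log\left(1 + \frac{4^\iota \eta_{22} N_{S_2}}{2^\iota \bar\eta_2 N_{B_2} + 1}\right), \tag{8.32}$$

where $\iota = 1$ for homodyne detection and $\iota = 0$ for heterodyne detection.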
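
The very strong interference test and the resulting rectangular capacity region can be evaluated with a few lines of code. This is a sketch under the reading of the conditions used in this section, with `iota = 1` for homodyne and `iota = 0` for heterodyne detection; the function name and argument order are assumptions.

```python
import math

def very_strong_region(eta11, eta12, eta21, eta22, Ns1, Ns2, Nb1, Nb2, iota):
    """Return (condition_holds, R1_max, R2_max) for coherent detection."""
    eta_bar_1 = 1 - eta11 - eta21
    eta_bar_2 = 1 - eta12 - eta22
    # Normalised noise terms 2^iota * eta_bar * Nb + 1 at each receiver.
    n1 = 2 ** iota * eta_bar_1 * Nb1 + 1
    n2 = 2 ** iota * eta_bar_2 * Nb2 + 1
    # Normalised own-signal terms 4^iota * eta * Ns.
    s1 = 4 ** iota * eta11 * Ns1
    s2 = 4 ** iota * eta22 * Ns2
    holds = (eta21 / eta22 >= (s1 + n1) / n2) and (eta12 / eta11 >= (s2 + n2) / n1)
    r1 = math.log2(1 + s1 / n1) / 2 ** iota
    r2 = math.log2(1 + s2 / n2) / 2 ** iota
    return holds, r1, r2

# Low-power example: eta11 = eta22 = 1/16, eta12 = eta21 = 1/2, Ns = Nb = 1.
params = (1 / 16, 1 / 2, 1 / 2, 1 / 16, 1.0, 1.0, 1.0, 1.0)
hom_ok, hom_r1, hom_r2 = very_strong_region(*params, iota=1)
het_ok, het_r1, het_r2 = very_strong_region(*params, iota=0)
```

In this low-power example both conditions hold for either receiver type, and the homodyne rates come out larger than the heterodyne rates.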



Figure 8.3: Capacity regions for coherent-state encodings and coherent detection, and
achievable rate regions for coherent-state encodings and joint detection receivers, both with
$\eta_{11} = \eta_{22} = 1/16$ and $\eta_{12} = \eta_{21} = 1/2$ ("very strong" interference for coherent detection). The
LHS ("Low Power") displays these regions in a low-power regime with $N_{S_1} = N_{S_2} = 1$ and $N_{B_1} = N_{B_2} = 1$,
and the RHS ("High Power") displays these regions in a high-power regime where $N_{S_1} = N_{S_2} = 100$.
Homodyne detection outperforms heterodyne detection in the low-power regime because it
has a reduced detection noise, while heterodyne detection outperforms homodyne detection
in the high-power regime because it has an increased bandwidth.



We can also consider the case when the senders employ coherent-state encodings
and the receivers employ a joint detection strategy on all of their respective channel
outputs. The conditions in (5.5) and (5.6) readily translate to this quantum setting
where we now consider $B_1$ and $B_2$ to be quantum systems, and the information
quantities in (5.5) and (5.6) become Holevo informations. The conditions in (5.5) and (5.6)
when restricted to coherent-state encodings translate to:

$$g(\eta_{22}N_{S_2} + \bar\eta_2 N_{B_2}) - g(\bar\eta_2 N_{B_2}) \le g(\eta_{21}N_{S_2} + \eta_{11}N_{S_1} + \bar\eta_1 N_{B_1}) - g(\eta_{11}N_{S_1} + \bar\eta_1 N_{B_1}),$$
$$g(\eta_{11}N_{S_1} + \bar\eta_1 N_{B_1}) - g(\bar\eta_1 N_{B_1}) \le g(\eta_{12}N_{S_1} + \eta_{22}N_{S_2} + \bar\eta_2 N_{B_2}) - g(\eta_{22}N_{S_2} + \bar\eta_2 N_{B_2}),$$

where $g(N) = (N+1)\log(N+1) - N\log(N)$ is the entropy of a thermal state with
mean photon number $N$.

An achievable rate region is then

$$R_1 \le g(\eta_{11}N_{S_1} + \bar\eta_1 N_{B_1}) - g(\bar\eta_1 N_{B_1}),$$
$$R_2 \le g(\eta_{22}N_{S_2} + \bar\eta_2 N_{B_2}) - g(\bar\eta_2 N_{B_2}).$$

These rates are achievable using a coherent-state encoding, but are not necessarily
optimal (though they would be optimal if the minimum-output entropy conjecture
from Refs. [GGL+04] [GHLM10] were true). Nevertheless, these rates always beat the
rates from homodyne and heterodyne detection. Figure 8.3 shows examples of the
achievable rate regions for a bosonic interference channel with very strong interference.
Both the low-power and high-power regimes are considered. Observe that the relative
superiority of homodyne and heterodyne detection depends on the power constraint and
that the joint detection strategy always outperforms them.



8.5 Strong interference case 



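Sato [Sat81] determined the capacity of the classical Gaussian interference channel
under "strong" interference. Theorem 5.2 from Chapter 5 gives us the capacity region
for quantum interference channels with strong interference. We will now apply these
results in the context of the bosonic interference channel.

The conditions for a channel to exhibit "strong" interference are given in equations
(5.16) and (5.17), and they translate to the following ones for coherent-state encoding
and coherent detection:

$$\frac{\eta_{21}}{\eta_{22}} \ge \frac{2^\iota \bar\eta_1 N_{B_1} + 1}{2^\iota \bar\eta_2 N_{B_2} + 1}, \qquad
\frac{\eta_{12}}{\eta_{11}} \ge \frac{2^\iota \bar\eta_2 N_{B_2} + 1}{2^\iota \bar\eta_1 N_{B_1} + 1},$$

and the capacity region becomes:

$$R_1 \le \frac{1}{2^\iota} \log\left(1 + \frac{4^\iota \eta_{11} N_{S_1}}{2^\iota \bar\eta_1 N_{B_1} + 1}\right), \tag{8.33}$$
$$R_2 \le \frac{1}{2^\iota} \log\left(1 + \frac{4^\iota \eta_{22} N_{S_2}}{2^\iota \bar\eta_2 N_{B_2} + 1}\right), \tag{8.34}$$
$$R_1 + R_2 \le \frac{1}{2^\iota} \min\left\{ \log\left(1 + \frac{4^\iota(\eta_{11}N_{S_1} + \eta_{21}N_{S_2})}{2^\iota \bar\eta_1 N_{B_1} + 1}\right),\; \log\left(1 + \frac{4^\iota(\eta_{12}N_{S_1} + \eta_{22}N_{S_2})}{2^\iota \bar\eta_2 N_{B_2} + 1}\right) \right\}, \tag{8.35}$$

where again $\iota = 1$ for homodyne detection and $\iota = 0$ for heterodyne detection.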
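
The strong interference region is a pentagon described by two individual rate bounds and a sum-rate bound. The sketch below evaluates the three bounds for coherent detection, again with `iota = 1` for homodyne and `iota = 0` for heterodyne detection; the function name and the returned tuple layout are illustrative assumptions.

```python
import math

def strong_region(eta11, eta12, eta21, eta22, Ns1, Ns2, Nb1, Nb2, iota):
    """Return (condition_holds, R1_max, R2_max, sum_rate_max)."""
    eta_bar_1 = 1 - eta11 - eta21
    eta_bar_2 = 1 - eta12 - eta22
    n1 = 2 ** iota * eta_bar_1 * Nb1 + 1
    n2 = 2 ** iota * eta_bar_2 * Nb2 + 1
    scale = 4 ** iota
    holds = (eta21 / eta22 >= n1 / n2) and (eta12 / eta11 >= n2 / n1)
    r1 = math.log2(1 + scale * eta11 * Ns1 / n1) / 2 ** iota
    r2 = math.log2(1 + scale * eta22 * Ns2 / n2) / 2 ** iota
    # Each receiver must be able to decode both messages jointly.
    sum_at_rx1 = math.log2(1 + scale * (eta11 * Ns1 + eta21 * Ns2) / n1)
    sum_at_rx2 = math.log2(1 + scale * (eta12 * Ns1 + eta22 * Ns2) / n2)
    r_sum = min(sum_at_rx1, sum_at_rx2) / 2 ** iota
    return holds, r1, r2, r_sum

# High-power example: eta11 = eta22 = 0.3, eta12 = eta21 = 0.6, Ns = 100, Nb = 1.
params = (0.3, 0.6, 0.6, 0.3, 100.0, 100.0, 1.0, 1.0)
hom = strong_region(*params, iota=1)
het = strong_region(*params, iota=0)
```

In this example the sum-rate bound is smaller than the sum of the two individual bounds for both receiver types, so the region is a genuine pentagon rather than a rectangle.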



Figure 8.4: The figure depicts the "strong" interference capacity regions in the high-power
regime for homodyne and heterodyne detection, and joint detection. The channel in the
figure is in the high-power regime: $N_{B_1} = N_{B_2} = 1$, $\eta_{11} = \eta_{22} = 0.3$, $\eta_{21} = \eta_{12} = 0.6$, and
$N_{S_1} = N_{S_2} = 100$. Heterodyne detection outperforms homodyne detection in this case.

We can also compute the achievable rate region using the joint detection strategy.
Figure 8.4 displays the different capacity and achievable rate regions when a free-space
interference channel exhibits "strong" interference.



8.6 Han-Kobayashi rate regions 



The Han-Kobayashi rate region is the largest known achievable rate region for the
classical interference channel [HK81]. The region was described in Theorem 5.3, and
in Chapter 5 we established the achievability of the Chong-Motani-Garg rate region, which is
equivalent to the Han-Kobayashi rate region.

The Han-Kobayashi coding strategy readily translates into a strategy for coherent-state
encoding and coherent detection. Sender $m$ shares the total photon number $N_{S_m}$
between her personal message and her common message. Let $\lambda_m$ be the fraction of
signal power that Sender $m$ devotes to her personal message, and let $\bar\lambda_m = (1 - \lambda_m)$
denote the remaining fraction of the signal power that Sender $m$ devotes to her common
message.

When Receiver 1 uses homodyne detection to decode the messages, we can identify 
the following components that are part of his received signal: 



$\lambda_1 \eta_{11} N_{S_1}$ = power of own personal message, (8.36)
$\bar\lambda_1 \eta_{11} N_{S_1}$ = power of own common message, (8.37)
$\eta_{11} N_{S_1}$ = total own signal power, (8.38)
$\eta_{21} N_{S_2}$ = total interference power, (8.39)
$\bar\lambda_2 \eta_{21} N_{S_2}$ = useful part of interference (other's common), (8.40)
$\lambda_2 \eta_{21} N_{S_2}$ = non-useful interference (other's personal), (8.41)
$N_1 = \frac{1}{4}(2\bar\eta_1 N_{B_1} + 1)$ = noise power. (8.42)



Similar expressions exist for Receiver 2. 



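Consider now the inequalities (HK1)-(HK9) which define the Han-Kobayashi rate
region (see page 87). When we evaluate each of the mutual informations for the signal
and noise quantities (8.36)-(8.42), we obtain the Han-Kobayashi achievable rate region
for the bosonic interference channel:

$$R_1 \le \gamma\left(\frac{\eta_{11}N_{S_1}}{\lambda_2\eta_{21}N_{S_2} + N_1}\right), \tag{BHK1}$$
$$R_1 \le \gamma\left(\frac{\lambda_1\eta_{11}N_{S_1}}{\lambda_2\eta_{21}N_{S_2} + N_1}\right) + \gamma\left(\frac{\lambda_2\eta_{22}N_{S_2} + \bar\lambda_1\eta_{12}N_{S_1}}{\lambda_1\eta_{12}N_{S_1} + N_2}\right), \tag{BHK2}$$
$$R_2 \le \gamma\left(\frac{\eta_{22}N_{S_2}}{\lambda_1\eta_{12}N_{S_1} + N_2}\right), \tag{BHK3}$$
$$R_2 \le \gamma\left(\frac{\lambda_2\eta_{22}N_{S_2}}{\lambda_1\eta_{12}N_{S_1} + N_2}\right) + \gamma\left(\frac{\lambda_1\eta_{11}N_{S_1} + \bar\lambda_2\eta_{21}N_{S_2}}{\lambda_2\eta_{21}N_{S_2} + N_1}\right), \tag{BHK4}$$
$$R_1 + R_2 \le \gamma\left(\frac{\eta_{11}N_{S_1} + \bar\lambda_2\eta_{21}N_{S_2}}{\lambda_2\eta_{21}N_{S_2} + N_1}\right) + \gamma\left(\frac{\lambda_2\eta_{22}N_{S_2}}{\lambda_1\eta_{12}N_{S_1} + N_2}\right), \tag{BHK5}$$
$$R_1 + R_2 \le \gamma\left(\frac{\lambda_1\eta_{11}N_{S_1}}{\lambda_2\eta_{21}N_{S_2} + N_1}\right) + \gamma\left(\frac{\eta_{22}N_{S_2} + \bar\lambda_1\eta_{12}N_{S_1}}{\lambda_1\eta_{12}N_{S_1} + N_2}\right), \tag{BHK6}$$
$$R_1 + R_2 \le \gamma\left(\frac{\lambda_1\eta_{11}N_{S_1} + \bar\lambda_2\eta_{21}N_{S_2}}{\lambda_2\eta_{21}N_{S_2} + N_1}\right) + \gamma\left(\frac{\lambda_2\eta_{22}N_{S_2} + \bar\lambda_1\eta_{12}N_{S_1}}{\lambda_1\eta_{12}N_{S_1} + N_2}\right), \tag{BHK7}$$
$$2R_1 + R_2 \le \gamma\left(\frac{\eta_{11}N_{S_1} + \bar\lambda_2\eta_{21}N_{S_2}}{\lambda_2\eta_{21}N_{S_2} + N_1}\right) + \gamma\left(\frac{\lambda_1\eta_{11}N_{S_1}}{\lambda_2\eta_{21}N_{S_2} + N_1}\right) + \gamma\left(\frac{\lambda_2\eta_{22}N_{S_2} + \bar\lambda_1\eta_{12}N_{S_1}}{\lambda_1\eta_{12}N_{S_1} + N_2}\right), \tag{BHK8}$$
$$R_1 + 2R_2 \le \gamma\left(\frac{\lambda_2\eta_{22}N_{S_2}}{\lambda_1\eta_{12}N_{S_1} + N_2}\right) + \gamma\left(\frac{\lambda_1\eta_{11}N_{S_1} + \bar\lambda_2\eta_{21}N_{S_2}}{\lambda_2\eta_{21}N_{S_2} + N_1}\right) + \gamma\left(\frac{\eta_{22}N_{S_2} + \bar\lambda_1\eta_{12}N_{S_1}}{\lambda_1\eta_{12}N_{S_1} + N_2}\right). \tag{BHK9}$$

Note the shorthand notation used: $\gamma(x) = \frac{1}{2}\log_2(1 + x)$.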
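
For a fixed power split, the bounds of the Han-Kobayashi region with homodyne detection reduce to a handful of numbers. The sketch below evaluates them from the signal, interference and noise powers identified above, assuming the standard nine-inequality form of the Han-Kobayashi region (two bounds on each individual rate, three on the sum rate, and one each on $2R_1 + R_2$ and $R_1 + 2R_2$). The function name, the dictionary keys and the grouping of bounds by `min` are choices made for this illustration.

```python
import math

def gamma(x):
    """Shorthand gamma(x) = (1/2) log2(1 + x)."""
    return 0.5 * math.log2(1 + x)

def bhk_homodyne_bounds(eta11, eta12, eta21, eta22, Ns1, Ns2, Nb1, Nb2, lam1, lam2):
    """Tightest bound of each type; lam_m is the personal-message power fraction."""
    N1 = (2 * (1 - eta11 - eta21) * Nb1 + 1) / 4    # noise power at Receiver 1
    N2 = (2 * (1 - eta12 - eta22) * Nb2 + 1) / 4    # noise power at Receiver 2
    # Receiver 1: own personal, own total, useful interference, effective noise.
    p1, s1 = lam1 * eta11 * Ns1, eta11 * Ns1
    u1, d1 = (1 - lam2) * eta21 * Ns2, lam2 * eta21 * Ns2 + N1
    # Receiver 2: the same quantities with the roles of the senders exchanged.
    p2, s2 = lam2 * eta22 * Ns2, eta22 * Ns2
    u2, d2 = (1 - lam1) * eta12 * Ns1, lam1 * eta12 * Ns1 + N2
    return {
        "R1": min(gamma(s1 / d1), gamma(p1 / d1) + gamma((p2 + u2) / d2)),
        "R2": min(gamma(s2 / d2), gamma(p2 / d2) + gamma((p1 + u1) / d1)),
        "R1+R2": min(
            gamma((s1 + u1) / d1) + gamma(p2 / d2),
            gamma(p1 / d1) + gamma((s2 + u2) / d2),
            gamma((p1 + u1) / d1) + gamma((p2 + u2) / d2),
        ),
        "2R1+R2": gamma((s1 + u1) / d1) + gamma(p1 / d1) + gamma((p2 + u2) / d2),
        "R1+2R2": gamma((s2 + u2) / d2) + gamma(p2 / d2) + gamma((p1 + u1) / d1),
    }

# Symmetric example with a 10%-personal, 90%-common power split.
bounds = bhk_homodyne_bounds(0.8, 0.1, 0.1, 0.8, 100.0, 100.0, 1.0, 1.0, 0.1, 0.1)
```

For a symmetric channel and a symmetric power split, the bounds on $R_1$ and $R_2$ coincide, which gives a quick consistency check on the expressions.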



Figure 8.5: The figure depicts the achievable rate regions by employing a Han-Kobayashi
coding strategy for homodyne and heterodyne detection. The channel parameters are $N_{S_1} =
N_{S_2} = 100$, $N_{B_1} = N_{B_2} = 1$, $\eta_{11} = \eta_{22} = 0.8$, and $\eta_{21} = \eta_{12} = 0.1$. All of these regions are
with respect to a 10%-personal, 90%-common Han-Kobayashi power split.

We can also calculate the shape of the Han-Kobayashi achievable rate region if the
senders employ coherent-state encodings and the receivers exploit heterodyne or joint
detection receivers. A statement of the inequalities for the other detection strategies
has been omitted, because they are similar to (BHK1)-(BHK9). Figure 8.5 shows the
relative sizes of the Han-Kobayashi rate regions achievable with coherent detection and
joint detection for a particular choice of input power split: $\lambda_m = 0.1$, $\bar\lambda_m = 0.9$.

8.7 Discussion

The semiclassical models for free-space optical communication are not sufficient to 
understand the ultimate limits on reliable communication rates, for both point-to- 
point and multiuser bosonic channels. We presented a quantum-mechanical model 
for the free-space optical interference channel and determined achievable rate regions 
using three different decoding strategies for the receivers. We also determined the 
Han-Kobayashi inner bound for homodyne, heterodyne and joint detection. 

Several open problems remain for this line of inquiry. We do not know if a coherent-state
encoding is in fact optimal for the free-space interference channel; it might
be that squeezed state transmitters could achieve higher communication rates as in
[Yen05a]. One could also evaluate the ergodic and outage capacity regions based on
the statistics of $\eta_{ij}$, which could be derived from the spatial coherence functions of the
stochastic mode patterns under atmospheric turbulence.

Conclusion



The time has come to conclude our inquiry into the problems of quantum network 
information theory. We will use this last chapter to summarize our results and highlight 
the specific contribution of this thesis. We will also discuss open problems and avenues 
for future research. 



9.1 Summary 

The present work demonstrates clearly that many of the problems of classical network 
information theory can be extended to the study of classical-quantum channels. Orig- 
inally, we set out to investigate the network information theory problems discussed in 
[EGC80]. It is fair to say that we have been successful on that front, since we managed
to develop coding strategies for multiple access channels (Chapter 4), interference
channels (Chapter 5), broadcast channels (Chapter 6) and relay channels (Chapter 7),
in the classical-quantum setting.

Our proof techniques are a mix of classical and quantum ideas. On the classical side 
we have the standard tools of information theory like averaging, conditional averaging 
and the use of the properties of typical sets. On the quantum side we saw how to build 
a projector sandwich, which contains many layers of conditionally typical projectors, 
how to incorporate state smoothing, which cuts out non-typical eigenvalues of a state, 
and the winning combination of the square root measurement and the Hayashi-Nagaoka 
operator inequality. 

Above all, it is the quantum conditionally typical projectors that played the biggest 
role in all our results. Conditionally typical projectors are truly amazing constructs, 
since they not only give us a basis in terms of which to analyze the quantum outputs, 
but also tell us exactly in which subspace we are likely to find the output states on 
average. 

9.2 New results 

Some of the results presented in this thesis have previously appeared in publications 
and some are original to this thesis. We will use this section to highlight the new 
results. 

The first contribution is the establishment of the classical/quantum packing lem- 
mas using conditionally typical sets/projectors. While these packing lemmas are not 
new in themselves, the proofs presented highlight the correspondences between the 
indicator functions for the classical conditionally typical sets and, their quantum coun- 
terparts, the conditionally typical projectors. The quantum packing lemma is an effort 
to abstract away the details of the quantum decoding strategy into a reusable component
as is done in [EGK10].

It is the author's hope that the classical and quantum packing lemmas presented 
in this work, along with their proofs, can serve as a bridge for classical information 
theorists to cross over to the quantum side. Alternately, we can say that there is only 
one side and interpret the move from classical Shannon theory to quantum Shannon 
theory as a type of system upgrade. Indeed, the change from indicator functions for the 
conditionally typical sets to conditionally typical projectors can be seen in terms of the 
OSI layered model for network architectures: quantum coding techniques are a change 
in physical layer (Layer 1) protocols while the random coding approach of the data 
link layer (Layer 2) stays the same. Note that this analogy only works for the classical 
communication problem, and that quantum communication and entanglement-assisted
communication are completely new problems in quantum Shannon theory, which have
no direct classical analogues. 

The main original contribution of this thesis is the achievability proof for the
quantum Chong-Motani-Garg rate region, which requires only two-sender simultaneous
decoding. By the equivalence $\mathcal{R}_{HK}(\mathcal{N}) = \mathcal{R}_{CMG}(\mathcal{N})$, we have established the
achievability of the quantum Han-Kobayashi rate region. We can therefore close the
book on the original research question which prompted our investigation more than
two years ago.



An interesting open problem is to prove Conjecture on the simultaneous de- 
coding for the three-sender quantum multiple access channels. This result would be a 
powerful building block for multiuser quantum Shannon theory. 

Classical channel coding



This appendix contains the proof of the classical packing lemma (Section A.2) and a
brief review on some of the properties of typical sets.



A.1 Classical typicality



In Section 2.2 we presented a number of properties of typical sequences and typical
sets that were used in the proof of the classical coding theorem. The reader is invited
to consult [CT91] and [Wil11] for the proofs.

In this section, we review the properties of conditionally typical sets in a more
general setting where an additional random variable $U^n$ is present. This is the setting
of the classical packing lemma, which will be stated and proved in Section A.2.

Consider the probability distribution $p_U(u)p_{X|U}(x|u) \in \mathcal{P}(\mathcal{U}, \mathcal{X})$ and the channel
$\mathcal{N} = (\mathcal{U} \times \mathcal{X}, p_{Y|XU}(y|x,u), \mathcal{Y})$. Let $(U^n, X^n)$ be distributed according to the
product distribution $\prod_{i=1}^n p_U(u_i)p_{X|U}(x_i|u_i)$. Let $Y^n$ denote the random variable that
corresponds to the output of the channel when the inputs are $(U^n, X^n)$.

Figure A.1: An illustration of the conditional dependence between the random variables
$(U^n, X^n, Y^n)$.



Conditionally typical sets

The input random variables $(U^n, X^n) \sim \prod_{i=1}^n p_U(u_i)p_{X|U}(x_i|u_i)$ and the channel $\mathcal{N}$
induce the following joint distribution:

$$p_{U^nX^nY^n}(u^n, x^n, y^n) = \prod_{i=1}^n p_U(u_i)\, p_{X|U}(x_i|u_i)\, p_{Y|XU}(y_i|x_i,u_i). \tag{A.1}$$

This corresponds to the assumption that the channel is memoryless, that is, the noise
in the $n$ uses of the channel is independent: $p_{Y^n|X^nU^n} = \prod_{i=1}^n p_{Y|XU}$.



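For any $\delta > 0$, define two sets of entropy conditionally typical sequences:

$$\mathcal{T}_\delta^{(n)}(Y|x^n,u^n) \equiv \left\{ y^n \in \mathcal{Y}^n : \left| -\frac{\log p_{Y^n|X^nU^n}(y^n|x^n,u^n)}{n} - H(Y|X,U) \right| \le \delta \right\}, \tag{A.2}$$
$$\mathcal{T}_\delta^{(n)}(Y|u^n) \equiv \left\{ y^n \in \mathcal{Y}^n : \left| -\frac{\log p_{Y^n|U^n}(y^n|u^n)}{n} - H(Y|U) \right| \le \delta \right\}, \tag{A.3}$$

where $H(Y|U) = -\sum_{u,y} p_U(u)p_{Y|U}(y|u)\log p_{Y|U}(y|u)$ is the conditional entropy of the
distribution $p_{Y|U}(y|u) = \sum_x p_{X|U}(x|u)p_{Y|XU}(y|x,u)$.

By the definition of these typical sets, we have the following bounds on the
probability of the sequences within these sets:

$$2^{-n[H(Y|U)+\delta]} \le p_{Y^n|U^n}(y^n|u^n) \le 2^{-n[H(Y|U)-\delta]} \qquad \forall y^n \in \mathcal{T}_\delta^{(n)}(Y|u^n), \tag{A.4}$$

for any sequences $u^n$ and $x^n$.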

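The channel outputs are likely to be conditionally typical sequences. More precisely,
we have that for any $\epsilon, \delta > 0$, and sufficiently large $n$ the expectations under $U^n$
and $X^n|U^n$ obey the bounds:

$$\mathbb{E}_{U^n}\mathbb{E}_{X^n|U^n} \sum_{y^n \in \mathcal{T}_\delta^{(n)}(Y|X^n,U^n)} p_{Y^n|X^nU^n}(y^n|X^n,U^n) \ge 1-\epsilon, \tag{A.5}$$
$$\mathbb{E}_{U^n} \sum_{y^n \in \mathcal{T}_\delta^{(n)}(Y|U^n)} p_{Y^n|U^n}(y^n|U^n) \ge 1-\epsilon. \tag{A.6}$$

Furthermore, we have the following bounds on the size of these conditionally typical
sets:

$$\left|\mathcal{T}_\delta^{(n)}(Y|x^n,u^n)\right| \le 2^{n[H(Y|X,U)+\delta]}, \tag{A.7}$$
$$\left|\mathcal{T}_\delta^{(n)}(Y|u^n)\right| \le 2^{n[H(Y|U)+\delta]}.$$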






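Equations (A.4) and (A.7) will play a key role in the proof of the classical packing
lemma in the next section. We restate these equations here in the language of indicator
functions for the single and double conditionally typical sets:

$$p_{Y^n|U^n}(y^n|u^n)\, \mathbf{1}_{\{y^n \in \mathcal{T}_\delta^{(n)}(Y|u^n)\}} \le 2^{-n[H(Y|U)-\delta]}\, \mathbf{1}_{\{y^n \in \mathcal{T}_\delta^{(n)}(Y|u^n)\}}$$

and

$$\sum_{y^n} \mathbf{1}_{\{y^n \in \mathcal{T}_\delta^{(n)}(Y|x^n,u^n)\}} \le 2^{n[H(Y|X,U)+\delta]}.$$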



A.2 Classical packing lemma

The packing lemma is a powerful tool for proving capacity theorems [EGK10]. We
give a proof of a packing lemma which, instead of the usual jointly typical sequences
argument, uses the properties of conditionally typical sets. This non-standard form of
the packing lemma is preferred because it highlights the similarities with its quantum
analogue, the quantum conditional packing lemma stated in Appendix B.2.

Lemma A.1 (Classical conditional packing lemma). Let $p_U(u)p_{X|U}(x|u) \in \mathcal{P}(\mathcal{U}, \mathcal{X})$
be an arbitrary code distribution, and let $\mathcal{N} = (\mathcal{U} \times \mathcal{X}, p_{Y|XU}(y|x,u), \mathcal{Y})$ be a channel.
Let $(U^n, X^n, \tilde{X}^n)$ be distributed according to $\prod_{i=1}^n p_U(u_i)p_{X|U}(x_i|u_i)p_{X|U}(\tilde{x}_i|u_i)$. Let $Y^n$
denote the random variable that corresponds to the output of the channel when the
inputs are $(U^n, \tilde{X}^n)$. Define $\mathcal{E}_2$ to be the event that the output will be part of the
conditionally typical set $\mathcal{T}_\epsilon^{(n)}(Y|X^n,U^n)$, given that it is part of the output-typical set
$Y^n \in \mathcal{T}_\epsilon^{(n)}(Y|U^n)$. We have that

$$\begin{aligned}
\mathbb{E}_{U^n}\mathbb{E}_{X^n}\mathbb{E}_{\tilde{X}^n} \Pr\{\mathcal{E}_2\}
&= \mathbb{E}_{U^n}\mathbb{E}_{X^n}\mathbb{E}_{\tilde{X}^n} \Pr_{Y^n|\tilde{X}^nU^n}\left( \left\{Y^n \in \mathcal{T}_\epsilon^{(n)}(Y|X^n,U^n)\right\} \cap \left\{Y^n \in \mathcal{T}_\epsilon^{(n)}(Y|U^n)\right\} \right) \\
&\le 2^{-n[I(X;Y|U)-\delta(\epsilon)]}.
\end{aligned} \tag{A.8}$$

Consider the random codebook $\{X^n(m)\}$, $m \in [1 : 2^{nR}]$ generated randomly and
independently according to $\prod_{i=1}^n p_{X|U}(x_i|u_i)$. There exists $\delta(\epsilon) \to 0$ as $\epsilon \to 0$ such that
the probability that the conditionally typical decoding will misinterpret the channel
output for $X^n(m)$ as produced by $X^n(m')$ for some $m' \ne m$, that is,

$$Y^n(m) = \mathcal{N}^n(U^n, X^n(m)), \quad Y^n(m) \in \mathcal{T}_\epsilon^{(n)}(Y|X^n(m'),U^n) \ \text{ and } \ Y^n(m) \in \mathcal{T}_\epsilon^{(n)}(Y|U^n),$$

vanishes as $n \to \infty$, if $R < I(X;Y|U) - \delta(\epsilon)$, where the mutual information is
calculated on the induced joint probability distribution $(U,X,Y) \sim p_{UXY}(u,x,y) =
p_{Y|XU}(y|x,u)p_{X|U}(x|u)p_U(u)$.
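
The exponent in the packing lemma is the conditional mutual information $I(X;Y|U) = H(Y|U) - H(Y|X,U)$ of the induced joint distribution. As a small illustration, the sketch below computes this quantity for finite alphabets; the array layout (`p_y_given_xu[u, x, y]`) and the function name are assumptions made for this example, and $p_U(u) > 0$ is assumed for every $u$.

```python
import numpy as np

def conditional_mutual_information(p_u, p_x_given_u, p_y_given_xu):
    """I(X;Y|U) in bits for p(u) p(x|u) p(y|x,u).

    Shapes: p_u is (|U|,), p_x_given_u is (|U|, |X|), p_y_given_xu is (|U|, |X|, |Y|).
    """
    joint = p_u[:, None, None] * p_x_given_u[:, :, None] * p_y_given_xu  # p(u,x,y)
    p_uy = joint.sum(axis=1)                                             # p(u,y)
    p_y_given_u = p_uy / p_u[:, None]                                    # p(y|u)

    def weighted_neg_log(weights, conditional):
        mask = weights > 0          # zero-probability terms contribute nothing
        return float(-np.sum(weights[mask] * np.log2(conditional[mask])))

    h_y_given_u = weighted_neg_log(p_uy, p_y_given_u)
    h_y_given_xu = weighted_neg_log(joint, np.broadcast_to(p_y_given_xu, joint.shape))
    return h_y_given_u - h_y_given_xu

# Example: binary U and X, and a binary symmetric channel with flip probability 0.1
# that ignores u.
p_u = np.array([0.5, 0.5])
p_x_given_u = np.array([[0.5, 0.5], [0.5, 0.5]])
bsc = np.array([[0.9, 0.1], [0.1, 0.9]])
p_y_given_xu = np.stack([bsc, bsc])
rate_threshold = conditional_mutual_information(p_u, p_x_given_u, p_y_given_xu)
```

In this example the threshold equals $1 - h_2(0.1)$, the capacity of the binary symmetric channel, since $U$ carries no information about the channel and $X$ is uniform.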



The description of the error event in the conditional packing lemma contains four
sources of randomness. First we have $U^n \sim \prod^n p_U$, then two independent
draws from $\prod^n p_{X|U}$ produce $X^n$ and $\tilde{X}^n$. Finally, the channel randomness produces
$Y^n = \mathcal{N}^n(U^n, \tilde{X}^n)$. The fact that $X^n$ and $\tilde{X}^n$ are conditionally independent given $U^n$
implies that $Y^n$ and $X^n$ are also conditionally independent given $U^n$. The situation is
illustrated in Figure A.2.



Proof. We give an argument based on the properties of the output-typical sequences
and a cardinality bound on the conditionally typical sets. Assume that the output
sequence $Y^n = \mathcal{N}^n(U^n, \tilde{X}^n)$ is output-typical ($\in \mathcal{T}_\epsilon^{(n)}(Y|U^n)$), and happens to also
fall in the conditionally typical set for some other codeword $\mathcal{T}_\epsilon^{(n)}(Y|X^n,U^n)$. This is
described by the following event:

$$\mathcal{E}_2 = \left\{Y^n \in \mathcal{T}_\epsilon^{(n)}(Y|X^n,U^n)\right\} \cap \left\{Y^n \in \mathcal{T}_\epsilon^{(n)}(Y|U^n)\right\}. \tag{A.9}$$

Figure A.2: The classical packing lemma. Two random codewords $X^n$ and $\tilde{X}^n$ are drawn
randomly and independently conditional on a third random variable $U^n$. Assume that the
random variable $U^n$ is also available at the receiver. What is the chance that the output
of the channel which corresponds to $\tilde{X}^n$ and $U^n$ will falsely be recognized to be in the set of
outputs which are likely to come from inputs $X^n$ and $U^n$? The receiver performs two tests
on the output sequence $Y^n$: (1) test membership in $\mathcal{T}_\epsilon^{(n)}(Y|U^n)$ and (2) test membership in
$\mathcal{T}_\epsilon^{(n)}(Y|X^n,U^n)$. If both these are successful, the outcome will be a misidentification error $\mathcal{E}_2$.



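Now consider the expectation of the probability of the event $\mathcal{E}_2$ under the code randomness:

$$\begin{aligned}
\mathbb{E}_{U^n}\,\mathbb{E}_{X^n|U^n}\,\mathbb{E}_{\tilde{X}^n|U^n}\Pr\{\mathcal{E}_2\}
&= \mathbb{E}_{U^n}\,\mathbb{E}_{X^n|U^n}\,\mathbb{E}_{\tilde{X}^n|U^n}\Pr_{Y^n|X^nU^n}\Big(\big\{Y^n \in \mathcal{T}_\epsilon^{(n)}(Y|\tilde{X}^n,U^n)\big\} \cap \big\{Y^n \in \mathcal{T}_\epsilon^{(n)}(Y|U^n)\big\}\Big) \\
&\overset{(1)}{=} \sum_{u^n,\tilde{x}^n} p_{U^nX^n}(u^n,\tilde{x}^n) \sum_{y^n} p_{Y^n|U^n}(y^n|u^n)\, \mathbf{1}_{\{y^n \in \mathcal{T}_\epsilon^{(n)}(Y|\tilde{x}^n,u^n)\}}\, \mathbf{1}_{\{y^n \in \mathcal{T}_\epsilon^{(n)}(Y|u^n)\}} \\
&\overset{(2)}{\le} \sum_{u^n,\tilde{x}^n} p_{U^nX^n}(u^n,\tilde{x}^n) \sum_{y^n} 2^{-n[H(Y|U)-\delta'(\epsilon)]}\, \mathbf{1}_{\{y^n \in \mathcal{T}_\epsilon^{(n)}(Y|\tilde{x}^n,u^n)\}}\, \mathbf{1}_{\{y^n \in \mathcal{T}_\epsilon^{(n)}(Y|u^n)\}} \\
&\overset{(3)}{\le} 2^{-n[H(Y|U)-\delta'(\epsilon)]} \sum_{u^n,\tilde{x}^n} p_{U^nX^n}(u^n,\tilde{x}^n)\, \big|\mathcal{T}_\epsilon^{(n)}(Y|\tilde{x}^n,u^n)\big| \\
&\overset{(4)}{\le} 2^{-n[H(Y|U)-\delta'(\epsilon)]}\, 2^{n[H(Y|XU)+\delta''(\epsilon)]} \\
&= 2^{-n[I(X;Y|U)-\delta(\epsilon)]}.
\end{aligned}$$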

Equality (1) follows from the definition of the conditional output distribution:

$$p_{Y^n|U^n}(y^n|u^n) = \sum_{x^n} p_{X^n|U^n}(x^n|u^n)\, p_{Y^n|X^nU^n}(y^n|x^n,u^n). \tag{A.10}$$

Inequality (2) follows from the fact that the sequence $y^n$ is conditionally output-typical, which means that $p_{Y^n|U^n}(y^n|u^n) \le 2^{-n[H(Y|U)-\delta'(\epsilon)]}$. Inequality (3) is the consequence of dropping an indicator, since in this way we can only enlarge the set. Inequality (4) follows from (A.4).



The second statement in the packing lemma follows from the independence of the codewords and the union bound. Let the random codebook $\{X^n(m)\}$, $m \in [1:2^{nR}]$, be generated randomly and independently according to $\prod_{i=1}^n p_{X|U}(x_i|u_i)$. Define $\mathcal{E}_2(m'|m)$ to be the event that the channel output when message $m$ is sent, $Y^n(m) = \mathcal{N}^n(U^n, X^n(m))$, happens to fall in the conditionally typical set for some other codeword, $\mathcal{T}_\epsilon^{(n)}(Y|X^n(m'),U^n)$, and is also output-typical ($\in \mathcal{T}_\epsilon^{(n)}(Y|U^n)$):

$$\mathcal{E}_2(m'|m) = \big\{Y^n(m) \in \mathcal{T}_\epsilon^{(n)}(Y|X^n(m'),U^n)\big\} \cap \big\{Y^n(m) \in \mathcal{T}_\epsilon^{(n)}(Y|U^n)\big\}. \tag{A.11}$$

If we define (E2) to be the total probability of misidentifications of this kind, we get:

$$\begin{aligned}
\Pr\{(\mathrm{E2})\} &= \Pr\Big\{ \bigcup_{m' \in \mathcal{M},\, m' \neq m} \mathcal{E}_2(m'|m) \Big\} \\
&\overset{(5)}{\le} \sum_{m' \in \mathcal{M},\, m' \neq m} \Pr\{\mathcal{E}_2(m'|m)\} \\
&\overset{(6)}{\le} |\mathcal{M}|\, 2^{-n[I(X;Y|U)-\delta(\epsilon)]} \\
&= 2^{-n[I(X;Y|U)-R-\delta(\epsilon)]}.
\end{aligned}$$

Inequality (5) uses the union bound. Inequality (6) is true because all the codewords of the codebook are picked independently, so the first statement of the lemma applies to each pair of codewords.

Thus if we choose $R < I(X;Y|U) - \delta(\epsilon)$, the probability of error will tend to zero as $n \to \infty$. □
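The rate condition in the lemma can be made concrete with a small numerical sketch. The code below is an added illustration, not part of the original proof: the toy channel (where $U$ selects between two binary symmetric channels), the function names, and the chosen values of $n$, $R$ and $\delta$ are all assumptions made for the example. It computes $I(X;Y|U)$ from the induced joint distribution and evaluates the union-bound expression $2^{-n[I(X;Y|U)-R-\delta]}$.

```python
import math

def cond_mutual_information(p_u, p_x_given_u, p_y_given_xu):
    """I(X;Y|U) in bits; p_y_given_xu is indexed as [u][x][y]."""
    total = 0.0
    for u, pu in enumerate(p_u):
        n_x = len(p_x_given_u[u])
        n_y = len(p_y_given_xu[u][0])
        # p(y|u) = sum_x p(x|u) p(y|x,u), the single-letter version of (A.10)
        p_y_given_u = [
            sum(p_x_given_u[u][x] * p_y_given_xu[u][x][y] for x in range(n_x))
            for y in range(n_y)
        ]
        for x in range(n_x):
            for y in range(n_y):
                joint = pu * p_x_given_u[u][x] * p_y_given_xu[u][x][y]
                if joint > 0:
                    ratio = p_y_given_xu[u][x][y] / p_y_given_u[y]
                    total += joint * math.log2(ratio)
    return total

def packing_bound(n, rate, info, delta):
    """Union-bound expression 2^{-n [I(X;Y|U) - R - delta]}."""
    return 2.0 ** (-n * (info - rate - delta))

# Hypothetical toy channel: U picks a binary symmetric channel.
p_u = [0.5, 0.5]
p_x_given_u = [[0.5, 0.5], [0.5, 0.5]]
p_y_given_xu = [
    [[0.9, 0.1], [0.1, 0.9]],  # u = 0: crossover probability 0.1
    [[0.7, 0.3], [0.3, 0.7]],  # u = 1: crossover probability 0.3
]

info = cond_mutual_information(p_u, p_x_given_u, p_y_given_xu)
for n in (100, 200, 400):
    print(n, packing_bound(n, rate=0.2, info=info, delta=0.01))
```

For any rate below $I(X;Y|U) - \delta$ the printed bound decreases exponentially in $n$, which is the behaviour the second statement of the lemma describes.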



The reader is now invited to review the Notation page (xi) at the beginning of the thesis. This table can be used as a bridge from classical information theory to quantum information theory. In Appendix B we will discuss the properties of conditionally typical projectors and prove a quantum packing lemma which follows exactly the same reasoning as the classical packing lemma.
Quantum channel coding

The first part of this appendix defines the quantum typical subspaces and conditionally typical projectors associated with a quantum multiple access channel problem. The second part of the appendix is the statement of the quantum packing lemma, which is a direct analogue of the classical packing lemma presented in Appendix A.2.



B.1 Quantum typicality

The concepts of entropy and entropy-typical sets generalize to the quantum setting by virtue of the spectral theorem. Let $\mathcal{H}^B$ be a $d_B$-dimensional Hilbert space and let $\rho^B \in \mathcal{D}(\mathcal{H}^B)$ be the density matrix associated with a quantum state. The spectral decomposition of $\rho^B$ is denoted $\rho^B = U\Lambda U^\dagger$, where $\Lambda$ is a diagonal matrix of nonnegative real eigenvalues that sum to one. We identify the eigenvalues of $\rho^B$ with the probability distribution $p_Y(y) = \Lambda_{yy}$ and write the spectral decomposition as:

$$\rho^B = \sum_{y=1}^{d_B} p_Y(y)\, |e_{\rho;y}\rangle\langle e_{\rho;y}|, \tag{B.1}$$

where $|e_{\rho;y}\rangle$ is the eigenvector of $\rho^B$ corresponding to eigenvalue $p_Y(y)$. The von Neumann entropy of the density matrix $\rho^B$ is

$$H(B)_\rho = -\mathrm{Tr}\{\rho^B \log \rho^B\} = H(p_Y). \tag{B.2}$$
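Define the set of $\delta$-typical eigenvalues according to the eigenvalue distribution $p_Y$:

$$\mathcal{T}^n_{p_Y,\delta} \equiv \left\{ y^n : \left| -\frac{\log p_{Y^n}(y^n)}{n} - H(Y) \right| \le \delta \right\}. \tag{B.3}$$

For a given string $y^n = y_1 y_2 \ldots y_i \ldots y_n$ we define the corresponding eigenvector as

$$|e_{\rho;y^n}\rangle = |e_{\rho;y_1}\rangle \otimes |e_{\rho;y_2}\rangle \otimes \cdots \otimes |e_{\rho;y_n}\rangle, \tag{B.4}$$

where for each symbol $y_i = b \in \{1, 2, \ldots, d_B\}$ we select the $b$th eigenvector $|e_{\rho;b}\rangle$. The typical subspace associated with the density matrix $\rho^B$ is defined as

$$A^n_{\rho,\delta} = \mathrm{span}\{ |e_{\rho;y^n}\rangle : y^n \in \mathcal{T}^n_{p_Y,\delta} \}. \tag{B.5}$$

The typical projector is defined as

$$\Pi^n_{\rho,\delta} \equiv \sum_{y^n \in \mathcal{T}^n_{p_Y,\delta}} |e_{\rho;y^n}\rangle\langle e_{\rho;y^n}|. \tag{B.6}$$

Note that the typical projector is linked twofold to the spectral decomposition of (B.1): the sequences $y^n$ are selected according to $p_Y$, and the set of typical vectors is built from tensor products of orthogonal eigenvectors $|e_{\rho;y}\rangle$.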



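Properties analogous to (2.3)-(2.5) hold. For any $\epsilon, \delta > 0$ and all sufficiently large $n$ we have

$$\mathrm{Tr}\{\Pi^n_{\rho,\delta}\, \rho^{\otimes n}\} \ge 1 - \epsilon, \tag{B.7}$$

$$2^{-n[H(B)_\rho+\delta]}\, \Pi^n_{\rho,\delta} \le \Pi^n_{\rho,\delta}\, \rho^{\otimes n}\, \Pi^n_{\rho,\delta} \le 2^{-n[H(B)_\rho-\delta]}\, \Pi^n_{\rho,\delta}, \tag{B.8}$$

$$[1-\epsilon]\, 2^{n[H(B)_\rho-\delta]} \le \mathrm{Tr}\{\Pi^n_{\rho,\delta}\} \le 2^{n[H(B)_\rho+\delta]}. \tag{B.9}$$

The interpretation of (B.8) is that the eigenvalues of the state $\rho^{\otimes n}$ are bounded between $2^{-n[H(B)_\rho-\delta]}$ and $2^{-n[H(B)_\rho+\delta]}$ on the typical subspace $A^n_{\rho,\delta}$.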
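Because the typical projector is diagonal in the eigenbasis of $\rho^{\otimes n}$, its rank and its overlap with the state can be computed by enumerating typical eigenvalue sequences. The sketch below is an added illustration under stated assumptions: the eigenvalue distribution $(0.8, 0.2)$, the block length and $\delta$ are hypothetical choices, and brute-force enumeration is only feasible for small $n$. It uses the convention $|-\tfrac{1}{n}\log p_{Y^n}(y^n) - H(Y)| \le \delta$ for the typical set.

```python
import itertools
import math

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def typical_sequences(p, n, delta):
    """All y^n with | -(1/n) log2 p(y^n) - H(Y) | <= delta (all p[y] > 0)."""
    h = entropy(p)
    typical = []
    for seq in itertools.product(range(len(p)), repeat=n):
        log_prob = sum(math.log2(p[y]) for y in seq)
        if abs(-log_prob / n - h) <= delta:
            typical.append(seq)
    return typical

# Hypothetical qubit state with eigenvalues 0.8 and 0.2.
p_y = [0.8, 0.2]
n, delta = 12, 0.15

typ = typical_sequences(p_y, n, delta)
rank = len(typ)  # Tr{Pi}: one basis vector per typical sequence
weight = sum(math.prod(p_y[y] for y in seq) for seq in typ)  # Tr{Pi rho^n}

print("rank of typical projector:", rank)
print("upper bound 2^{n[H+delta]}:", 2 ** (n * (entropy(p_y) + delta)))
print("weight on typical subspace:", weight)
```

The rank bound on the right of (B.9) holds for every $n$, while the weight in (B.7) only approaches one as $n$ grows, so at such a short block length the printed weight is still well below $1 - \epsilon$ for small $\epsilon$.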



Signal states. Consider now a set of quantum states $\{\rho_{x_a}\}$, $x_a \in \mathcal{X}$. We perform the spectral decomposition of each $\rho_{x_a}$ to obtain

$$\rho^B_{x_a} = \sum_{y=1}^{d_B} p_{Y|X}(y|x_a)\, |e_{\rho_{x_a};y}\rangle\langle e_{\rho_{x_a};y}|, \tag{B.10}$$

where $p_{Y|X}(y|x_a)$ is the $y$th eigenvalue of $\rho^B_{x_a}$ and $|e_{\rho_{x_a};y}\rangle$ is the corresponding eigenvector.

We can think of $\{\rho_{x_a}\}$ as a classical-quantum (c-q) channel where the input is some $x_a \in \mathcal{X}$ and the output is the corresponding quantum state $\rho^B_{x_a}$. If the channel is memoryless, then for each input sequence $x^n = x_1 x_2 \cdots x_n$ we have the corresponding tensor product output state:

$$\rho^{B^n}_{x^n} = \rho^{B_1}_{x_1} \otimes \rho^{B_2}_{x_2} \otimes \cdots \otimes \rho^{B_n}_{x_n} = \bigotimes_{i=1}^{n} \rho^{B_i}_{x_i}. \tag{B.11}$$

To avoid confusion with the indices, we use $i \in [n]$ to denote the index of a symbol $x$ in the sequence $x^n$ and $a \in [1, \ldots, |\mathcal{X}|]$ to denote the different symbols in the alphabet $\mathcal{X}$.

Conditionally typical projector. Consider the ensemble $\{p_X(x_a), \rho_{x_a}\}$. The choice of distributions induces the following classical-quantum state:

$$\rho^{XB} = \sum_{x_a} p_X(x_a)\, |x_a\rangle\langle x_a|^X \otimes \rho^B_{x_a}. \tag{B.12}$$

We can now define the conditional entropy of this state as

$$H(B|X)_\rho \equiv \sum_{x_a \in \mathcal{X}} p_X(x_a)\, H(\rho_{x_a}), \tag{B.13}$$

or equivalently, expressed in terms of the eigenvalues of the signal states, the conditional entropy becomes

$$H(B|X)_\rho = H(Y|X) = \sum_{x_a} p_X(x_a)\, H(Y|x_a), \tag{B.14}$$

where $H(Y|x_a) = -\sum_y p_{Y|X}(y|x_a) \log p_{Y|X}(y|x_a)$ is the entropy of the eigenvalue distribution shown in (B.10).
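As an added numerical illustration of (B.13), the sketch below evaluates $H(B|X)_\rho$ and the entropy of the average state for a two-state qubit ensemble. Everything specific here is an assumption made for the example: the ensemble ($|0\rangle$ and $|+\rangle$ with equal probability), the restriction to real symmetric $2\times 2$ density matrices, and the helper names. The closed-form eigenvalue formula replaces a general eigensolver so that the block needs only the standard library.

```python
import math

def eigenvalues_2x2(rho):
    """Eigenvalues of a real symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = rho
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return [mean + radius, mean - radius]

def von_neumann_entropy(rho):
    """H(rho) in bits: the Shannon entropy of the eigenvalue distribution."""
    return -sum(l * math.log2(l) for l in eigenvalues_2x2(rho) if l > 1e-15)

def average_state(p_x, states):
    """Ensemble average sum_a p_X(x_a) rho_{x_a}."""
    return [
        [sum(p * s[i][j] for p, s in zip(p_x, states)) for j in range(2)]
        for i in range(2)
    ]

# Hypothetical ensemble: |0><0| and |+><+| with equal probability.
p_x = [0.5, 0.5]
states = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]

h_b_given_x = sum(p * von_neumann_entropy(s) for p, s in zip(p_x, states))
h_b = von_neumann_entropy(average_state(p_x, states))
holevo = h_b - h_b_given_x  # H(B) - H(B|X)

print("H(B|X) =", h_b_given_x)
print("H(B)   =", h_b)
print("H(B) - H(B|X) =", holevo)
```

For pure signal states the conditional entropy vanishes, so the difference $H(B)_\rho - H(B|X)_\rho$ reduces to the entropy of the average state.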



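We define the $x^n$-conditionally typical projector as follows:

$$\Pi^n_{\rho_{x^n},\delta} \equiv \sum_{y^n \in \mathcal{T}^n_{\rho_{x^n},\delta}} |e_{\rho_{x^n};y^n}\rangle\langle e_{\rho_{x^n};y^n}|, \tag{B.15}$$

where the set of conditionally typical eigenvalues $\mathcal{T}^n_{\rho_{x^n},\delta}$ consists of all sequences $y^n$ which satisfy:

$$\left| -\frac{\log p_{Y^n|X^n}(y^n|x^n)}{n} - H(Y|X) \right| \le \delta, \tag{B.16}$$

with $p_{Y^n|X^n}(y^n|x^n) = \prod_{i=1}^{n} p_{Y|X}(y_i|x_i)$.

The states $|e_{\rho_{x^n};y^n}\rangle$ are built from tensor products of eigenvectors for the individual signal states:

$$|e_{\rho_{x^n};y^n}\rangle = |e_{\rho_{x_1};y_1}\rangle \otimes |e_{\rho_{x_2};y_2}\rangle \otimes \cdots \otimes |e_{\rho_{x_n};y_n}\rangle,$$

where the string $y^n = y_1 y_2 \ldots y_i \ldots y_n$ varies over different choices of bases for $\mathcal{H}^B$. For each symbol $y_i = b \in \{1, 2, \ldots, d_B\}$ we select $|e_{\rho_{x_a};b}\rangle$: the $b$th eigenvector from the eigenbasis of $\rho_{x_a}$ corresponding to the letter $x_i = x_a \in \mathcal{X}$.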



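Analogous to the three properties (B.7), (B.8) and (B.9), the conditionally typical projector obeys:

$$\mathbb{E}_{X^n}\, \mathrm{Tr}\big[ \Pi^n_{\rho_{X^n},\delta}\, \rho^{B^n}_{X^n} \big] \ge 1 - \epsilon, \tag{B.17}$$

$$2^{-n[H(B|X)_\rho+\delta]}\, \Pi^n_{\rho_{x^n},\delta} \le \Pi^n_{\rho_{x^n},\delta}\, \rho^{B^n}_{x^n}\, \Pi^n_{\rho_{x^n},\delta} \le 2^{-n[H(B|X)_\rho-\delta]}\, \Pi^n_{\rho_{x^n},\delta}, \tag{B.18}$$

$$[1-\epsilon]\, 2^{n[H(B|X)_\rho-\delta]} \le \mathbb{E}_{X^n}\, \mathrm{Tr}\big[ \Pi^n_{\rho_{X^n},\delta} \big] \le 2^{n[H(B|X)_\rho+\delta]}. \tag{B.19}$$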



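MAC code. Consider now a quantum multiple access channel $(\mathcal{X}_1 \times \mathcal{X}_2,\, \rho^B_{x_1,x_2},\, \mathcal{H}^B)$ and two input distributions $p_{X_1}$ and $p_{X_2}$. Define the random codebooks $\{X_1^n(m_1)\}_{m_1 \in \mathcal{M}_1}$ and $\{X_2^n(m_2)\}_{m_2 \in \mathcal{M}_2}$ generated from the product distributions $p_{X_1^n}$ and $p_{X_2^n}$ respectively. The choice of distributions induces the following classical-quantum state $\rho^{X_1X_2B}$,

$$\rho^{X_1X_2B} = \sum_{x_a,x_b} p_{X_1}(x_a)\, p_{X_2}(x_b)\, |x_a\rangle\langle x_a|^{X_1} \otimes |x_b\rangle\langle x_b|^{X_2} \otimes \rho^B_{x_a,x_b}, \tag{B.20}$$

and the averaged output states:

$$\bar{\rho}_{x_a} \equiv \sum_{x_b} p_{X_2}(x_b)\, \rho_{x_a,x_b}, \tag{B.21}$$

$$\bar{\rho}_{x_b} \equiv \sum_{x_a} p_{X_1}(x_a)\, \rho_{x_a,x_b}, \tag{B.22}$$

$$\bar{\rho} \equiv \sum_{x_a,x_b} p_{X_1}(x_a)\, p_{X_2}(x_b)\, \rho_{x_a,x_b}. \tag{B.23}$$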
The conditional quantum entropy $H(B|X_1X_2)_\rho$ is:

$$H(B|X_1X_2)_\rho = \sum_{x_a,x_b} p_{X_1}(x_a)\, p_{X_2}(x_b)\, H(\rho_{x_a,x_b}), \tag{B.24}$$

and using the average states we define:

$$H(B|X_1)_\rho = \sum_{x_a} p_{X_1}(x_a)\, H(\bar{\rho}_{x_a}), \tag{B.25}$$

$$H(B|X_2)_\rho = \sum_{x_b} p_{X_2}(x_b)\, H(\bar{\rho}_{x_b}), \tag{B.26}$$

$$H(B)_\rho = H(\bar{\rho}). \tag{B.27}$$



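Similarly to equation (B.15) and for each message pair $(m_1, m_2)$ we define the conditionally typical projector for the encoded state $\rho_{x_1^n(m_1)x_2^n(m_2)}$ to be $\Pi^n_{\rho_{x_1^n(m_1)x_2^n(m_2)},\delta}$. From this point on, we will not indicate the messages $m_1, m_2$ explicitly, because the codewords are constructed identically for each message.

Analogous to (2.46), the following upper bound applies:

$$\mathbb{E}_{X_1^nX_2^n}\, \mathrm{Tr}\big\{ \Pi^n_{\rho_{X_1^nX_2^n},\delta} \big\} \le 2^{n[H(B|X_1X_2)_\rho+\delta]}, \tag{B.28}$$

and we can also bound from below the eigenvalues of the state $\rho_{x_1^nx_2^n}$ as follows:

$$2^{-n[H(B|X_1X_2)_\rho+\delta]}\, \Pi^n_{\rho_{x_1^nx_2^n},\delta} \le \Pi^n_{\rho_{x_1^nx_2^n},\delta}\, \rho_{x_1^nx_2^n}\, \Pi^n_{\rho_{x_1^nx_2^n},\delta} \le 2^{-n[H(B|X_1X_2)_\rho-\delta]}\, \Pi^n_{\rho_{x_1^nx_2^n},\delta}. \tag{B.29}$$

We define conditionally typical projectors for each of the averaged states:

$$\bar{\rho}_{x_1^n} \ \to\ \Pi^n_{\bar{\rho}_{x_1^n},\delta}, \tag{B.30}$$

$$\bar{\rho}_{x_2^n} \ \to\ \Pi^n_{\bar{\rho}_{x_2^n},\delta}, \tag{B.31}$$

$$\bar{\rho}^{\otimes n} \ \to\ \Pi^n_{\bar{\rho},\delta}. \tag{B.32}$$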

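These projectors obey the standard eigenvalue upper bounds when acting on the states with respect to which they are defined:

$$\Pi^n_{\bar{\rho}_{x_1^n},\delta}\, \bar{\rho}_{x_1^n}\, \Pi^n_{\bar{\rho}_{x_1^n},\delta} \le 2^{-n[H(B|X_1)_\rho-\delta]}\, \Pi^n_{\bar{\rho}_{x_1^n},\delta}, \tag{B.33}$$

$$\Pi^n_{\bar{\rho}_{x_2^n},\delta}\, \bar{\rho}_{x_2^n}\, \Pi^n_{\bar{\rho}_{x_2^n},\delta} \le 2^{-n[H(B|X_2)_\rho-\delta]}\, \Pi^n_{\bar{\rho}_{x_2^n},\delta}, \tag{B.34}$$

$$\Pi^n_{\bar{\rho},\delta}\, \bar{\rho}^{\otimes n}\, \Pi^n_{\bar{\rho},\delta} \le 2^{-n[H(B)_\rho-\delta]}\, \Pi^n_{\bar{\rho},\delta}. \tag{B.35}$$

We have the following bounds on the rank of the conditionally typical projectors:

$$\mathrm{Tr}\big\{ \Pi^n_{\bar{\rho}_{x_1^n},\delta} \big\} \le 2^{n[H(B|X_1)_\rho+\delta]}, \tag{B.36}$$

$$\mathrm{Tr}\big\{ \Pi^n_{\bar{\rho}_{x_2^n},\delta} \big\} \le 2^{n[H(B|X_2)_\rho+\delta]}, \tag{B.37}$$

$$\mathrm{Tr}\big\{ \Pi^n_{\bar{\rho},\delta} \big\} \le 2^{n[H(B)_\rho+\delta]}. \tag{B.38}$$

The encoded state $\rho_{X_1^nX_2^n}$ is well supported by all the typical projectors on average:

$$\mathbb{E}_{X_1^nX_2^n}\, \mathrm{Tr}\big[ \Pi^n_{\rho_{X_1^nX_2^n},\delta}\, \rho_{X_1^nX_2^n} \big] \ge 1 - \epsilon, \tag{B.39}$$

$$\mathbb{E}_{X_1^nX_2^n}\, \mathrm{Tr}\big[ \Pi^n_{\bar{\rho}_{X_1^n},\delta}\, \rho_{X_1^nX_2^n} \big] \ge 1 - \epsilon, \tag{B.40}$$

$$\mathbb{E}_{X_1^nX_2^n}\, \mathrm{Tr}\big[ \Pi^n_{\bar{\rho}_{X_2^n},\delta}\, \rho_{X_1^nX_2^n} \big] \ge 1 - \epsilon, \tag{B.41}$$

$$\mathbb{E}_{X_1^nX_2^n}\, \mathrm{Tr}\big[ \Pi^n_{\bar{\rho},\delta}\, \rho_{X_1^nX_2^n} \big] \ge 1 - \epsilon. \tag{B.42}$$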
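B.2 Quantum packing lemma

Lemma B.1. Let $p_U(u)p_{X|U}(x|u) \in \mathcal{P}(\mathcal{U},\mathcal{X})$ be an arbitrary code distribution, and let $\mathcal{N} = (\mathcal{U} \times \mathcal{X},\, \rho^B_{u,x},\, \mathcal{H}^B)$ be a classical-quantum channel. Let $(U^n, X^n, \tilde{X}^n)$ be distributed according to $\prod_{i=1}^n p_U(u_i)\,p_{X|U}(x_i|u_i)\,p_{X|U}(\tilde{x}_i|u_i)$. Consider the channel $\mathcal{N}^n$ defined by the following map:

$$(u^n, x^n) \ \to\ \rho^{B^n}_{u^n,x^n}, \tag{B.43}$$

where $u^n$ is available as side information to the receiver and the sender. Define the state $\rho_{u^n} = \mathbb{E}_{X^n|u^n}\, \mathcal{N}^n(u^n, X^n)$ and the conditionally typical projectors $\Pi_{\rho_{u^n}}$ for the state $\rho_{u^n}$ and $\Pi_{\rho_{u^n,\tilde{x}^n}}$ for the state $\rho_{u^n,\tilde{x}^n}$.

We want to measure the expectation of the overlap between $\rho_{U^n,X^n}$ and the operator $\Pi_{\rho_{U^n}} \Pi_{\rho_{U^n,\tilde{X}^n}} \Pi_{\rho_{U^n}}$ associated with some $(U^n, \tilde{X}^n)$. We define this quantity to be:

$$\mathcal{E}_2 \equiv \mathrm{Tr}\big[ \Pi_{\rho_{U^n}}\, \Pi_{\rho_{U^n,\tilde{X}^n}}\, \Pi_{\rho_{U^n}}\, \rho_{U^n,X^n} \big]. \tag{B.44}$$

Then $\mathcal{E}_2$ can be bounded as follows:

$$\mathbb{E}_{U^n}\,\mathbb{E}_{X^n|U^n}\,\mathbb{E}_{\tilde{X}^n|U^n}\ \mathcal{E}_2 \le 2^{-n[I(X;B|U)-\delta(\epsilon)]}. \tag{B.45}$$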

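Let the random codebook $\{X^n(m)\}$, $m \in [1:2^{nR}]$, be generated randomly and independently according to $\prod_{i=1}^n p_{X|U}(x_i|u_i)$. Then there exists $\delta(\epsilon) \to 0$ as $\epsilon \to 0$ such that the expectation of the total overlap between conditionally typical output spaces can be bounded from above as follows:

$$(\mathrm{E2}) \equiv \sum_{m' \neq m} \mathbb{E}_{U^n}\, \mathbb{E}_{X^n(m)|U^n}\, \mathbb{E}_{X^n(m')|U^n}\, \mathrm{Tr}\big[ \Pi_{\rho_{U^n}}\, \Pi_{\rho_{U^n,X^n(m')}}\, \Pi_{\rho_{U^n}}\, \rho_{U^n,X^n(m)} \big] \le |\mathcal{M}|\, 2^{-n[I(X;B|U)-\delta(\epsilon)]}. \tag{B.46}$$

To bound the expectation of the second term, define $X^n(m)$ and $X^n(m')$ to be the two random codewords assigned to messages $m$ and $m'$ respectively.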
Figure B.1: The quantum packing lemma. Two random codewords $X^n$ and $\tilde{X}^n$ are drawn randomly and independently conditional on a third random variable $U^n$. Assume that the random variable $U^n$ is also available at the receiver. What is the chance that the output of the channel which corresponds to $X^n$ and $U^n$ will falsely be recognized to be in the set of outputs which are likely to come from inputs $\tilde{X}^n$ and $U^n$?



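$$\begin{aligned}
\mathbb{E}_{U^n}\,\mathbb{E}_{X^n|U^n}\,\mathbb{E}_{\tilde{X}^n|U^n}\ \mathcal{E}_2
&= \mathbb{E}_{U^n}\,\mathbb{E}_{X^n|U^n}\,\mathbb{E}_{\tilde{X}^n|U^n}\, \mathrm{Tr}\big[ \Pi_{\rho_{U^n}}\, \Pi_{\rho_{U^n,\tilde{X}^n}}\, \Pi_{\rho_{U^n}}\, \rho_{U^n,X^n} \big] \\
&\overset{(1)}{=} \mathbb{E}_{U^n}\,\mathbb{E}_{\tilde{X}^n|U^n}\, \mathrm{Tr}\big[ \Pi_{\rho_{U^n}}\, \Pi_{\rho_{U^n,\tilde{X}^n}}\, \Pi_{\rho_{U^n}}\, \rho_{U^n} \big] \\
&= \mathbb{E}_{U^n}\,\mathbb{E}_{\tilde{X}^n|U^n}\, \mathrm{Tr}\big[ \Pi_{\rho_{U^n,\tilde{X}^n}}\, \Pi_{\rho_{U^n}}\, \rho_{U^n}\, \Pi_{\rho_{U^n}} \big] \\
&\overset{(2)}{\le} 2^{-n[H(B|U)-\delta]}\ \mathbb{E}_{U^n}\,\mathbb{E}_{\tilde{X}^n|U^n}\, \mathrm{Tr}\big[ \Pi_{\rho_{U^n,\tilde{X}^n}}\, \Pi_{\rho_{U^n}} \big] \\
&\overset{(3)}{\le} 2^{-n[H(B|U)-\delta]}\ \mathbb{E}_{U^n}\,\mathbb{E}_{\tilde{X}^n|U^n}\, \mathrm{Tr}\big[ \Pi_{\rho_{U^n,\tilde{X}^n}} \big] \\
&\overset{(4)}{\le} 2^{-n[H(B|U)-\delta]}\ 2^{n[H(B|U,X)+\delta]} \\
&= 2^{-n[I(X;B|U)-\delta(\epsilon)]}.
\end{aligned}$$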

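Equation (1) is true by the definition $\mathbb{E}_{X^n|U^n}\{\rho_{U^n,X^n}\} = \rho_{U^n}$. Inequality (2) uses the eigenvalue bound as in (B.18). Inequality (3) follows from

$$\begin{aligned}
\mathrm{Tr}\big[ \Pi_{\rho_{U^n,\tilde{X}^n}}\, \Pi_{\rho_{U^n}} \big]
&= \mathrm{Tr}\big[ \Pi_{\rho_{U^n,\tilde{X}^n}}\, \Pi_{\rho_{U^n}}\, \Pi_{\rho_{U^n,\tilde{X}^n}} \big] \\
&\le \mathrm{Tr}\big[ \Pi_{\rho_{U^n,\tilde{X}^n}}\, I\ \Pi_{\rho_{U^n,\tilde{X}^n}} \big] \\
&= \mathrm{Tr}\big[ \Pi_{\rho_{U^n,\tilde{X}^n}} \big].
\end{aligned}$$

Inequality (4) follows from the bound on the expected rank of the conditionally typical projector, as in (B.19).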



Applications

Holevo-Schumacher-Westmoreland (HSW) Theorem

Given a channel $(\mathcal{X}, \rho^B_x, \mathcal{H}^B)$, if we set:

• $U = \emptyset$

• $p_U(u)\,p_{X|U}(x|u) = p_X(x)$

• $\Pi_{\rho_{U^n}}\, \Pi_{\rho_{U^n,X^n}}\, \Pi_{\rho_{U^n}} = \Pi_{\bar{\rho}}\, \Pi_{\rho_{X^n}}\, \Pi_{\bar{\rho}}$

then the quantum packing lemma tells us how many conditionally typical subspaces we can pack inside the output-typical subspace before they start to overlap too much.

Successive decoding for the quantum multiple access channel

Given a quantum multiple access channel $(\mathcal{X}_1 \times \mathcal{X}_2,\, \rho_{x_1,x_2},\, \mathcal{H}^B)$, we set:

• $U = X_1$

• $p_U(u)\,p_{X|U}(x|u) = p_{X_1}(x_1)\,p_{X_2}(x_2)$

• $\rho_{U^n,X^n} = \rho_{X_1^n,X_2^n}$

to obtain the bound on the rate $R_2$ when using the successive decoding $m_1 \to m_2|m_1$.
Superposition coding

Consider the situation in which superposition encoding is used to encode two messages $\ell$ and $m$ in a codebook suitable for the channel $(\mathcal{X}, \rho^B_x, \mathcal{H}^B)$:

n 



Consider the following substitutions:

• $U = W$

• $p_U(u)\,p_{X|U}(x|u) = p_W(w)\,p_{X|W}(x|w)$

• $\Pi_{\rho_{U^n}}\, \Pi_{\rho_{U^n,X^n}}\, \Pi_{\rho_{U^n}} = \Pi_{\rho_{W^n}}\, \Pi_{\rho_{W^n,X^n}}\, \Pi_{\rho_{W^n}}$

The packing lemma gives us a bound on the error associated with decoding a wrong message $m$ (the satellite message) given that we correctly decoded $\ell$ (the cloud center).
This appendix contains a series of proofs which were omitted from the text in Section 5.4 in order to make it more readable.



C.1 Geometry of Chong-Motani-Garg rate region

We will now prove the inequalities from Lemma 5.2 on the geometry of $\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$, the multiple access channel rate region for Receiver 1 in the Chong-Motani-Garg coding strategy. This inequality structure is important for the geometrical observations of the Şaşoğlu argument.

Proof of Lemma 5.2. If we expand the shorthand notation of equations (5.30) through (5.32) we obtain the following inequalities:

$$I(X_1;B_1|W_1W_2Q) \le I(X_1;B_1|W_2Q) \le I(X_1W_2;B_1|Q), \tag{C.1}$$

$$I(X_1;B_1|W_1W_2Q) \le I(X_1W_2;B_1|W_1Q) \le I(X_1W_2;B_1|Q), \tag{C.2}$$

$$I(X_1;B_1|W_1W_2Q) + I(X_1W_2;B_1|Q) \le I(X_1;B_1|W_2Q) + I(X_1W_2;B_1|W_1Q). \tag{C.3}$$



Observe that $W_2$ is independent from $W_1$ and $X_1$, thus

$$H(X_1W_2) = H(X_1) + H(W_2), \qquad H(W_1W_2) = H(W_1) + H(W_2). \tag{C.4}$$




Also, since $X_1$ is obtained from $W_1$, we have $H(X_1) = H(X_1W_1)$ and we can add or remove the random variable $W_1$ next to $X_1$ as needed without changing the entropy.

To get the first part of inequality (5.30), we observe

$$\begin{aligned}
I(X_1;B_1|W_1W_2) &= I(X_1;B_1W_2|W_1) \\
&= H(X_1W_1) + H(B_1W_2W_1) - H(X_1B_1W_2W_1) - H(W_1) - H(W_1W_2) + H(W_1W_2) \\
&= H(X_1) + [H(B_1W_2W_1) - H(W_1W_2)] - H(X_1B_1W_2W_1) - H(W_1) + H(W_1) + H(W_2) \\
&\le H(X_1) + [H(B_1W_2) - H(W_2)] - H(X_1B_1W_2W_1) + H(W_2) \\
&= [H(X_1) + H(W_2)] + H(B_1W_2) - H(X_1B_1W_2W_1) - H(W_2) \\
&= I(X_1;B_1|W_2),
\end{aligned}$$

where the inequality follows from $H(B_1|W_1W_2) \le H(B_1|W_2)$ (conditioning cannot increase entropy).

The second part of inequality (5.30) follows from a similar observation using $H(B_1|W_2) \le H(B_1)$:

$$\begin{aligned}
I(X_1;B_1|W_2) &= H(X_1W_2) + H(B_1W_2) - H(X_1B_1W_2) - H(W_2) \\
&= H(X_1W_2) + [H(B_1W_2) - H(W_2)] - H(X_1B_1W_2) \\
&\le H(X_1W_2) + [H(B_1)] - H(X_1B_1W_2) \\
&= I(X_1W_2;B_1).
\end{aligned}$$



For the first part of (5.31) we repeat the above argument but with extra conditioning on the $W_1$ system:

$$\begin{aligned}
I(X_1;B_1|W_1W_2) &= H(X_1W_1W_2) + H(B_1W_1W_2) - H(X_1B_1W_1W_2) - H(W_1W_2) \\
&= H(X_1W_1W_2) + [H(B_1W_1W_2) - H(W_1W_2)] - H(X_1B_1W_1W_2) \\
&\le H(X_1W_2) + [H(B_1|W_1)] - H(X_1B_1W_1W_2) \\
&= H(X_1W_2) + H(B_1W_1) - H(X_1B_1W_1W_2) - H(W_1) \\
&= H(X_1W_1W_2) + H(B_1W_1) - H(X_1B_1W_1W_2) - H(W_1) \\
&= I(X_1W_2;B_1|W_1).
\end{aligned}$$

For the second part of (5.31) we have

$$\begin{aligned}
I(X_1W_2;B_1|W_1) &= H(X_1W_1W_2) + H(B_1W_1) - H(X_1B_1W_1W_2) - H(W_1) \\
&= H(X_1W_2) + [H(B_1W_1) - H(W_1)] - H(X_1B_1W_2) \\
&\le H(X_1W_2) + H(B_1) - H(X_1B_1W_2) \\
&= I(X_1W_2;B_1).
\end{aligned}$$



Finally, for inequality (5.32) we need to use the strong subadditivity relation

$$H(B_1W_1W_2) + H(B_1) \le H(B_1W_1) + H(B_1W_2). \tag{C.5}$$

The steps are

$$\begin{aligned}
I(X_1;B_1|W_1W_2) &+ I(X_1W_2;B_1) \\
&= H(X_1W_1W_2) + H(B_1W_1W_2) - H(X_1B_1W_1W_2) - H(W_1W_2) \\
&\quad + H(X_1W_2) + H(B_1) - H(X_1B_1W_2) \\
&= [H(B_1W_1W_2) + H(B_1)] + H(X_1W_1W_2) - H(X_1B_1W_1W_2) - H(W_1) - H(W_2) \\
&\quad + H(X_1W_2) - H(X_1B_1W_2) \\
&\le [H(B_1W_1) + H(B_1W_2)] + H(X_1W_1W_2) - H(X_1B_1W_1W_2) - H(W_1) - H(W_2) \\
&\quad + H(X_1W_2) - H(X_1B_1W_2) \\
&= H(X_1W_1W_2) + H(B_1W_1) - H(X_1B_1W_1W_2) - H(W_1) \\
&\quad + H(X_1W_2) + H(B_1W_2) - H(X_1B_1W_2) - H(W_2) \\
&= I(X_1W_2;B_1|W_1) + I(X_1;B_1|W_2).
\end{aligned}$$

This completes the proof of Lemma 5.2. □
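The three inequalities can be sanity-checked numerically on a classical toy distribution. The sketch below is an added illustration and not part of the proof: the joint distribution (uniform $X_1 \in \{0,1,2,3\}$, $W_1 = \lfloor X_1/2 \rfloor$, an independent bit $W_2$, and a noisy output $B_1$), the trivial time-sharing variable $Q$, and all function names are assumptions chosen so that $W_2$ is independent of $(W_1, X_1)$ and $W_1$ is determined by $X_1$, as the proof requires.

```python
import itertools
import math
from collections import defaultdict

NAMES = {"W1": 0, "X1": 1, "W2": 2, "B1": 3}

def build_joint(flip=0.2):
    """Hypothetical joint pmf over (w1, x1, w2, b1)."""
    joint = defaultdict(float)
    for x1, w2, noise in itertools.product(range(4), range(2), range(2)):
        w1 = x1 // 2                      # W1 is a function of X1
        b1 = ((x1 + 2 * w2) % 4) ^ noise  # output with a noisy low bit
        joint[(w1, x1, w2, b1)] += 0.25 * 0.5 * (flip if noise else 1 - flip)
    return dict(joint)

def H(joint, names):
    """Joint entropy in bits of the listed variables."""
    idx = [NAMES[v] for v in names]
    marginal = defaultdict(float)
    for outcome, p in joint.items():
        marginal[tuple(outcome[i] for i in idx)] += p
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)

def I(joint, a, b, c=()):
    """Conditional mutual information I(A;B|C) from joint entropies."""
    a, b, c = list(a), list(b), list(c)
    return H(joint, a + c) + H(joint, b + c) - H(joint, a + b + c) - H(joint, c)

J = build_joint()
I_a = I(J, ["X1"], ["B1"], ["W1", "W2"])  # I(X1;B1|W1W2)
I_b = I(J, ["X1"], ["B1"], ["W2"])        # I(X1;B1|W2)
I_c = I(J, ["X1", "W2"], ["B1"], ["W1"])  # I(X1W2;B1|W1)
I_d = I(J, ["X1", "W2"], ["B1"])          # I(X1W2;B1)

tol = 1e-9  # tolerance for inequalities that hold with equality
print("(C.1):", I_a <= I_b + tol and I_b <= I_d + tol)
print("(C.2):", I_a <= I_c + tol and I_c <= I_d + tol)
print("(C.3):", I_a + I_d <= I_b + I_c + tol)
```

A numerical check of this kind cannot replace the proof, but it is a quick way to catch a sign or index error when the entropy expansions are rewritten.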



C.2 Detailed explanation concerning moving points 



In Section 5.5.2 we used Lemma 5.3 to show that we can move any point on the (b) or (d) planes to an equivalent point on the (a) or (c) planes. We now give the proof.

Proof. We have to show how to move any point in $(b_j \cup d_j) \setminus (a_j \cup c_j)$ to an equivalent point in $a_j \cup c_j$. Because the rates $R_{1c}$ and $R_{2c}$ appear in the coordinates of both $P_1$ and $P_2$, we cannot move each point independently. Indeed, Şaşoğlu points out that the points $P_1$ and $P_2$ are coupled by the common rates.

A priori, we have to consider all possible starting combinations of the points. However, using the following observations we can restrict the number of possibilities significantly.
using the following observations we can restrict the number of possibilities significantly. 

1. If $P_1 \in b_1 \setminus a_1$, then $P_2 \in a_2 \cup b_2$.

The fact that $P_1 \in b_1 \setminus a_1$ implies that equation (b1) is tight,

$$R_{1p} + R_{1c} = I(b_1), \tag{C.6}$$

and (a1) is loose,

$$R_{1p} < I(a_1). \tag{C.7}$$

Then there exists $\delta > 0$ such that the point $P_1' = (R_{1p}+\delta,\, R_{1c}-\delta,\, R_{2c}) \in \mathcal{R}_{\mathrm{CMG}}(p)$. Suppose for a contradiction that $P_2$ was originally in $(c_2 \cup d_2) \setminus (a_2 \cup b_2)$. The decrease in $R_{1c}$ associated with the move from $P_1$ to $P_1'$ would allow us to increase one of the rates for Receiver 2, which is a contradiction since we assumed that $R_2 = R_{2c} + R_{2p}$ was optimal. More specifically, if $P_2 \in c_2$ or $P_2 \in d_2$, then we would be allowed to increase $R_{2p}$ by $\delta$, to obtain $P_2' = (R_{2p}+\delta,\, R_{2c},\, R_{1c}-\delta)$, resulting in the operating point $(R_1, R_2+\delta)$, which contradicts the assumption that the initial rate pair $(R_1, R_2)$ was on the boundary of $\mathcal{R}_{\mathrm{CMG}}$. Thus, if $P_1 \in b_1 \setminus a_1$, then $P_2$ must be in $a_2 \cup b_2$.

2. If $P_1 \in d_1 \setminus (a_1 \cup b_1 \cup c_1)$, then $P_2 \in a_2$.

Again consider moving the rates to obtain $P_1' = (R_{1p}+\delta,\, R_{1c}-\delta,\, R_{2c}) \in d_1 \setminus (a_1 \cup b_1 \cup c_1)$. If $P_2$ was originally in $c_2$ or $d_2$, then the decrease in $R_{1c}$ would allow us to move the point $P_2$ to a new rate triple $P_2' = (R_{2p}+\delta,\, R_{2c},\, R_{1c}-\delta)$, resulting in the operating point $(R_1, R_2+\delta)$, which again leads to a contradiction. Therefore $P_2$ can only be in $a_2$ or $b_2$. But if $P_2$ were in $b_2$, then by observation 1 (with a change of roles between $P_1$ and $P_2$) we would have $P_1 \in (a_1 \cup b_1)$, which contradicts our assumption that $P_1 \in d_1 \setminus (a_1 \cup b_1 \cup c_1)$. Thus we see that if $P_1 \in d_1 \setminus (a_1 \cup b_1 \cup c_1)$, then $P_2 \in a_2$.

By the above reasoning we have restricted the possible combinations where the points $(P_1, P_2)$ could lie initially. To prove Theorem 5.3, we have to show that we can deal with the following combinations: $b_1 \times a_2$, $a_1 \times b_2$, $b_1 \times b_2$, $d_1 \times a_2$ and $a_1 \times d_2$.

We now show that we can move any point $P_1 \in b_1 \cup d_1$ (on one of the bad planes) to an equivalent point lying in $a_1 \cup c_1$.

• Case $(P_1, P_2) \in b_1 \times a_2$:

In this case, equations (b1) and (a2) are tight, which means that the rate triples are of the form

$$\begin{aligned}
P_1 &= (R_{1p}, R_{1c}, R_{2c}), \quad \text{such that } R_{1p} + R_{1c} = I(b_1), \\
P_2 &= (R_{2p}, R_{2c}, R_{1c}) = (I(a_2), R_{2c}, R_{1c}).
\end{aligned}$$

If we apply an $R_{1c} \to R_{1p}$ rate moving operation to $P_1$ we can obtain a new point $P_1'$ with

$$P_1' = (R_{1p}', R_{1c}', R_{2c}) = (I(a_1),\, I(b_1) - I(a_1),\, R_{2c}) \in a_1 \cap b_1.$$

As a result of the move, the point $P_2$ will be moved to

$$P_2' = (R_{2p}, R_{2c}, R_{1c}') = (I(a_2),\, R_{2c},\, I(b_1) - I(a_1)),$$

which continues to lie in the $a_2$ plane. Observe that during this rate moving operation the sum rates remain unchanged: $(R_{1p} + R_{1c},\, R_{2p} + R_{2c}) = (R_1, R_2) = (R_{1p}' + R_{1c}',\, R_{2p} + R_{2c})$.

The case when $(P_1, P_2) \in a_1 \times b_2$ is analogous.
• Case $(P_1, P_2) \in b_1 \times b_2$:

Our starting points are

$$\begin{aligned}
P_1 &= (R_{1p}, R_{1c}, R_{2c}), \quad \text{such that } R_{1p} + R_{1c} = I(b_1), \\
P_2 &= (R_{2p}, R_{2c}, R_{1c}), \quad \text{such that } R_{2p} + R_{2c} = I(b_2).
\end{aligned}$$

We will first do an $R_{1c} \to R_{1p}$ rate moving operation until we get to the plane $a_1$. The points we obtain are

$$\begin{aligned}
P_1' &= (R_{1p}', R_{1c}', R_{2c}) = (I(a_1),\, I(b_1) - I(a_1),\, R_{2c}) \in a_1 \cap b_1, \\
P_2' &= (R_{2p}, R_{2c}, R_{1c}') = (R_{2p},\, R_{2c},\, I(b_1) - I(a_1)) \in b_2.
\end{aligned}$$

We then perform a second rate moving operation $R_{2c} \to R_{2p}$ in order to move to the plane $a_2$:

$$\begin{aligned}
P_1'' &= (R_{1p}', R_{1c}', R_{2c}') = (I(a_1),\, I(b_1) - I(a_1),\, I(b_2) - I(a_2)) \in a_1 \cap b_1, \\
P_2'' &= (R_{2p}', R_{2c}', R_{1c}') = (I(a_2),\, I(b_2) - I(a_2),\, I(b_1) - I(a_1)) \in a_2 \cap b_2.
\end{aligned}$$

Thus we have managed to move the points $(P_1, P_2) \in b_1 \times b_2$ to equivalent points $(P_1'', P_2'') \in a_1 \times a_2$ while leaving the sum rate $(R_1, R_2)$ unchanged.

• Case $(P_1, P_2) \in d_1 \times a_2$:

If $P_1 \in d_1$, it means that the triple sum inequality (d1) is tight. The starting rates are

$$\begin{aligned}
P_1 &= (R_{1p}, R_{1c}, R_{2c}), \quad \text{such that } R_{1p} + R_{1c} + R_{2c} = I(d_1), \\
P_2 &= (I(a_2), R_{2c}, R_{1c}) \in a_2.
\end{aligned}$$

To move $P_1$ away from the interior of the $d_1$ plane we will once again use a rate moving operation $R_{1c} \to R_{1p}$. This operation will increase the rate $R_{1p}$ at the expense of the rate $R_{1c}$. We cannot increase the rate $R_{1p}$ indefinitely: sooner or later one of the two other rate constraints on $R_{1p}$ will saturate.

The other constraints on $R_{1p}$ come from equations (a1) and (c1), so by rate moving we will eventually reach either the $a_1$ or the $c_1$ plane.

In the first case the resulting point will be

$$P_1' = (R_{1p}', R_{1c}', R_{2c}) = (I(a_1),\, R_{1c}',\, R_{2c}) \in a_1 \cap d_1,$$

where $R_{1c}' = I(d_1) - I(a_1) - R_{2c}$ because by rate moving we stayed in the $d_1$ plane.

In the latter case, where moving the rates of $P_1 \in d_1$ puts us on the $c_1$ plane, the resulting points will be

$$\begin{aligned}
P_1' &= (R_{1p}', R_{1c}', R_{2c}) \in c_1 \cap d_1, \quad \text{s.t. } R_{1p}' + R_{2c} = I(c_1), \\
P_2' &= (I(a_2), R_{2c}, R_{1c}') \in a_2.
\end{aligned}$$

Once again, the sum rate $(R_1, R_2)$ remains unchanged by the rate moving, but the moved points $(P_1', P_2')$ are now either in $a_1 \times a_2$ or $c_1 \times a_2$ as claimed.

The case when $(P_1, P_2) \in a_1 \times d_2$ is analogous.

Therefore, given an arbitrary point $(R_1, R_2) \in \partial\mathcal{R}_{\mathrm{CMG}}(\mathcal{N}, p_{\mathrm{CMG}})$, there always exists a choice of common/private rates such that $(P_1, P_2) \in (a_1 \cup c_1) \times (a_2 \cup c_2)$ with $(R_{1p} + R_{1c},\, R_{2p} + R_{2c}) = (R_1, R_2)$.

□ 
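The rate moving operation used throughout the proof can be sketched in a few lines of code. This is an added illustration under stated assumptions: the function names and the numerical values of $I(a_1)$, $I(b_1)$ and the starting rates are hypothetical, and only the $b_1 \times a_2$ case is shown. A rate triple for Receiver 1 is stored as $(R_{1p}, R_{1c}, R_{2c})$ and for Receiver 2 as $(R_{2p}, R_{2c}, R_{1c})$, so the two points share the common rates.

```python
def move_common_to_private(point, delta):
    """Shift delta from the common rate to the private rate of one sender."""
    r_private, r_common, r_other_common = point
    if not 0 <= delta <= r_common:
        raise ValueError("delta must lie in [0, R_common]")
    return (r_private + delta, r_common - delta, r_other_common)

def move_b1_to_a1(p1, p2, i_a1):
    """Case b1 x a2: raise R1p up to I(a1) and propagate the new R1c to P2."""
    delta = i_a1 - p1[0]
    p1_new = move_common_to_private(p1, delta)
    # P2 = (R2p, R2c, R1c) is coupled to P1 through the common rate R1c.
    p2_new = (p2[0], p2[1], p1_new[1])
    return p1_new, p2_new

# Hypothetical values: I(a1) = 0.3 and I(b1) = 1.0, with (b1) tight.
i_a1 = 0.3
p1 = (0.2, 0.8, 0.4)  # (R1p, R1c, R2c) with R1p + R1c = I(b1)
p2 = (0.5, 0.4, 0.8)  # (R2p, R2c, R1c)

p1_new, p2_new = move_b1_to_a1(p1, p2, i_a1)
print("P1' =", p1_new)
print("P2' =", p2_new)
print("R1 before and after:", p1[0] + p1[1], p1_new[0] + p1_new[1])
```

The private rate $R_{1p}$ ends at $I(a_1)$ while the sum rate $R_1 = R_{1p} + R_{1c}$ and the rates $R_{2p}$, $R_{2c}$ are untouched, which is why the moved points are called equivalent.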



C.3 Redundant inequality 



In Section 5.5.3, we claimed that the inequality (5.49) is less tight than the sum rate constraint obtained by adding equations (5.48) and (5.51).

To see that this is true, consider the following argument starting from the positivity of the conditional mutual information $I(W_1;W_2|B_1) \ge 0$:

$$H(W_1W_2B_1) + H(B_1) \le H(W_1B_1) + H(W_2B_1). \tag{C.8}$$

We now add $H(X_1W_1W_2)$ and subtract $H(X_1W_1W_2B_1)$ on both sides of the inequality:

$$H(W_1W_2B_1) + H(B_1) + H(X_1W_1W_2) - H(X_1W_1W_2B_1) \le H(W_1B_1) + H(W_2B_1) + H(X_1W_1W_2) - H(X_1W_1W_2B_1).$$

We now use the fact that $W_2$ is independent from $W_1$, so $H(W_1) - H(W_1W_2) = -H(W_2)$, to obtain:

$$\begin{aligned}
&H(W_1W_2B_1) + H(B_1) + H(X_1W_1W_2) - H(X_1W_1W_2B_1) + H(W_1) - H(W_1W_2) \\
&\qquad \le H(W_1B_1) + H(W_2B_1) + H(X_1W_1W_2) - H(X_1W_1W_2B_1) - H(W_2).
\end{aligned}$$

We move the term $H(W_1B_1)$ to the other side and rearrange the terms to obtain the final expression:

$$\begin{aligned}
&H(X_1W_1W_2) + H(W_1W_2B_1) - H(X_1W_1W_2B_1) - H(W_1W_2) + H(W_1) + H(B_1) - H(W_1B_1) \\
&\qquad \le H(X_1W_1W_2) + H(W_2B_1) - H(X_1W_1W_2B_1) - H(W_2),
\end{aligned}$$

that is,

$$\underbrace{I(X_1;B_1|W_1W_2)}_{I(a_1)} + I(W_1;B_1) \le \underbrace{I(X_1;B_1|W_2)}_{I(b_1)},$$

which shows that we can drop the constraint from equation (5.49).
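The same conclusion can be cross-checked with the chain rule for mutual information. The short derivation below is an added sketch, not taken from the original text; it assumes, as in Appendix C.1, that $W_2$ is independent of $(W_1, X_1)$ and that $W_1$ is determined by $X_1$.

```latex
\begin{align*}
I(W_1;B_1) &\le I(W_1;B_1W_2)
            = I(W_1;W_2) + I(W_1;B_1|W_2)
            = I(W_1;B_1|W_2), \\
I(X_1;B_1|W_1W_2) + I(W_1;B_1)
           &\le I(X_1;B_1|W_1W_2) + I(W_1;B_1|W_2) \\
           &= I(X_1W_1;B_1|W_2)
            = I(X_1;B_1|W_2).
\end{align*}
```

The first line uses $I(W_1;W_2) = 0$, and the last equality uses the fact that adding $W_1$ next to $X_1$ does not change any entropy.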